The shortcomings of the classical tests of homogeneity, i.e., tests of
the hypothesis of equality of parameters, have long been known. The only
question answered by such a test is whether there is any difference at
all among the available populations. Bahadur [2], Mosteller [58] and
Paulson [61] were among the earliest research workers to recognize this
and to formulate the problem as a multiple decision problem concerned
with the selection and ordering of k populations.

In the two decades since these early papers, ranking and selection
problems have become an active area of statistical research. There have
been two approaches to these problems, the 'indifference zone' approach
and the 'subset selection' approach. In the first approach, a single
population (or a fixed number of populations) is chosen and is guaranteed
to be the one of interest with a fixed probability P* whenever the
unknown parameters lie outside some subset, or zone of indifference, of
the entire parameter space. This formulation is due to Bechhofer [11].
Other contributions to this problem are Bechhofer and Sobel [16],
Bechhofer, Dunnett and Sobel [14], Sobel and Huyett [75], Chambers and
Jarratt [20], Barr and Rizvi [7], Eaton [26], and Mahamunulu [56]. A
quite adequate bibliography may be found in Santner [69] and Bechhofer,
Kiefer and Sobel [15].
The second approach assumes no a priori information about the
parameter space. A single population is not necessarily chosen; rather
a subset of the given k populations is selected depending on the outcome
of the experiment. It is guaranteed to contain the population of
interest with probability P*, the basic probability requirement in these
procedures. This 'subset selection' formulation is due to Gupta [38],
[41]. Some recent contributions to this aspect of the problem are
Gnanadesikan [30], Gnanadesikan and Gupta [31], Gupta and McDonald [44],
McDonald [57], Panchapakesan [60], Gupta and Panchapakesan [45], Gupta
and Studden [46], Santner [69], Huang [50], Huang [49], Gupta and
Huang [42].

The sequential and multistage aspects of the ranking and selection
problems have been explored by Bechhofer, Dunnett and Sobel [14],
Bechhofer [12], Bechhofer and Blumenthal [13], and Paulson [62], [63],
[64], [65]. Nearly all of this work in sequential and multistage
procedures is based on the indifference zone approach. Barron and
Gupta [5] and Huang [50] consider sequential procedures using the
subset selection approach.

An optimum theory was developed for the first approach by
Bahadur [2], Bahadur and Goodman [3], Lehmann [55] and Eaton [26].
Contributions toward optimum properties of the subset selection approach
have also been made by Goel and Rubin [33], Govindarajulu and
Harvey [36], Gupta [39], Deely and Gupta [25], Lehmann [54], Robbins
[68], Seal [70], [71], [72] and Studden [77].

The  main  purpose  of  this  thesis  is  to  study  some  problems  using 
the  subset  selection  approach  and  make  some  contributions. 
Chapter I deals with some selection and ranking procedures for
the smallest unknown parameter of k Poisson populations. In Section 1.2,
a procedure is derived to select a subset containing the best of
several Poisson populations. In Section 1.3, a procedure conditioned on
the total sum of the observations is proposed. A different selection
procedure of the type suggested by Seal is considered in Section 1.4.
In Section 1.5, selection of populations better than a standard is
discussed. An application to a test of homogeneity is described in
Section 1.6. Tables related to the selection procedures are given at
the end of this chapter. These tables give the necessary constants
to carry out the procedure and also evaluate the efficiency of the
procedure in terms of the probability of a correct selection and the
expected proportion in the selected subset under specified
configurations of parameters.

Chapter II discusses some results on subset selection procedures
for double exponential (Laplace) distributions. Section 2.1 deals
with some characteristics and use of this distribution as a model. In
Section 2.2, a selection procedure for the location parameters is
proposed and studied using the subset selection approach. Selection
with respect to the largest location parameter using the
indifference zone approach is considered in Section 2.3. Section 2.4
gives a discussion of selecting the t best populations. In Section 2.5
a procedure is proposed for subset selection with respect to the scale
parameter. In Section 2.6, a test of homogeneity is given which is based
on the sample median range. The distribution of a statistic associated
with the procedure in Section 2.2 is considered in Section 2.1. Tables
of the upper percentage points of $Y = \max_{1 \le i \le p} (X_i - X_0)$,
where $X_0, X_1, \ldots, X_p$ are independent and identically distributed
Laplace random variables with scale parameter unity, are given at the
end of this chapter.

In Chapter III, the subset selection approach is applied to the
problem of classification of k univariate normal populations. In
Section 3.2, two classification rules with respect to the mean are
proposed according as the k populations have (i) common known variance
$\sigma^2$ and (ii) common unknown variance $\sigma^2$. In Section 3.3,
a classification rule with respect to the variance is given. These
rules might not classify $\pi_0$ as any one of the k populations. Hence
different classification procedures, with respect to the mean and the
reciprocal of the coefficient of variation, which classify $\pi_0$ as at
least one of the k populations are proposed and studied in Section 3.4
and Section 3.5, respectively.

Chapter IV deals with some selection procedures for the negative
binomial populations. A statistic of the type
$c \max_{1 \le j \le k} X_j - X_i$ is used in Section 4.2, where $X_i$
denotes the number of failures before the $r_i$th success from the ith
negative binomial population. In Section 4.3, a rule based on the same
statistic as in Section 4.2 but conditioned on $\sum X_i$, the total
number of observations, is investigated. The problem of selecting all
populations better than a standard is considered in Section 4.4. An
application is given in Section 4.5. For k = 2 and various values of
r, t and P*, tables of the constants required for the procedure in
Section 4.3 are given at the end of the chapter.


CHAPTER  I 


ON  SUBSET  SELECTION  PROCEDURES  FOR  POISSON  POPULATIONS 
1.1 Introduction

The Poisson distribution has been used as a model in several statistical
problems. As early as 1898, Bortkiewicz [18] used it to fit the data
pertaining to the deaths by kicks from horses in a regiment. The Poisson
process is used as a model in many applied probability problems, for
example, for waiting times, for arrivals of calls at a telephone
exchange, for arrivals of radioactive particles at a Geiger counter,
etc.

In this chapter our object is to study the problem of comparing k
Poisson distributions. Not much work has been done on this problem.
More specifically, we consider the problem of selecting a subset of k
Poisson populations including the best, which is the one associated with
the smallest value of the parameter. Gupta and Huang [43] have considered
the selection problem according to the largest value of the parameter.
However, a procedure of the type proposed by them does not work for the
problem of selection with respect to the smallest parameter. Goel [32]
has shown that the usual type of selection procedures do not exist for
some values of the probability P* of a correct selection. In this chapter,
we propose a procedure different from that of Gupta and Huang [43] for
subset selection which exists for all P*. The rule is based on a result
of Chapman [21], who showed that there is no unbiased estimator of the
ratio $\lambda_1 / \lambda_2$ with finite variance, where $\lambda_1$,
$\lambda_2$ are the expected values of two independent random variables
$X_1$, $X_2$ with Poisson distributions, but that the estimator
$X_1 / (X_2 + 1)$ is "almost unbiased".

Let $\pi_1, \pi_2, \ldots, \pi_k$ be k independent Poisson populations,
i.e., $\pi_i$ has a Poisson distribution with unknown parameter
$\lambda_i$, i = 1,2,...,k. Suppose that we take n independent
observations $X_{i1}, \ldots, X_{in}$ from each population $\pi_i$,
i = 1,...,k. A sufficient statistic for $\lambda_i$ is
$\sum_{j=1}^{n} X_{ij}$; hence without loss of generality we will assume
the sample size to be one. Let
$\lambda_{[1]} \le \lambda_{[2]} \le \cdots \le \lambda_{[k]}$ be the
ordered values of the parameters; it is assumed that there is no a priori
information available about the correct pairing of the ordered
$\lambda_{[i]}$ and the k given populations from which observations are
taken.

Given any P* (1/k < P* < 1), we wish to select a non-empty (small)
subset of these k populations such that the subset contains the
population corresponding to the parameter $\lambda_{[1]}$ with
probability at least P*, no matter what the configuration of
$\lambda_1, \lambda_2, \ldots, \lambda_k$ is. We use the notation CS for
correct selection, where CS means that the selected subset includes the
best population. Therefore we are interested in defining a selection
procedure R such that

$$
\inf_{\lambda \in \Omega} P_{\lambda}(CS \mid R) \ge P^*, \tag{1.1.1}
$$

where $\Omega$ is the set of all k-tuples
$\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_k)$, $\lambda_i > 0$,
i = 1,2,...,k.


Let $X_1, X_2, \ldots, X_k$ denote the independent observations from
populations $\pi_1, \pi_2, \ldots, \pi_k$, respectively. Let $X_{(i)}$ be
that value of $X_1, \ldots, X_k$ which is associated with
$\lambda_{[i]}$; of course $X_{(i)}$ is unknown.
In Section 1.2, we discuss a subset selection rule that satisfies
the basic probability requirement (1.1.1), and we find an upper bound
for the expected subset size. In Section 1.3, we consider a selection
procedure conditioned on the total sum of the observations; a method
for constructing the (conservative) constants and an upper bound for
the expected subset size are derived for this conditional rule.
Section 1.4 deals with a different selection procedure of the type
suggested by Seal for the normal means problem. We also discuss the
Seal type procedure conditioned on the total sum of the observations,
in which case the selection constant can be determined precisely so as
to satisfy the basic probability requirement. An exact expression for
the expected subset size of the conditional Seal type procedure is
stated in Theorem 1.4.5. Selection procedures for selecting a subset
which contains all populations better than a standard are considered
in Section 1.5. An application to a test of homogeneity is mentioned in
Section 1.6. Tables related to the selection procedures are given at the
end of this chapter.

1.2 The Unconditional Selection Procedure $R_1$

(A) The Rule $R_1$ and the Probability of Correct Selection

$R_1$: Select the population $\pi_i$ in the subset if and only if

$$
X_i \le c_1 \min_{1 \le j \le k} X_j + c_1,
$$

where $c_1 \ge 1$ is to be chosen so as to satisfy the basic probability
requirement (1.1.1).
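The decision step of the rule can be illustrated with a short Python sketch. The function name `select_subset_r1` and the sample counts are hypothetical and are not taken from the thesis; the constant `c1` is assumed to have been determined beforehand so that (1.1.1) holds.

```python
from typing import List, Sequence


def select_subset_r1(x: Sequence[int], c1: float) -> List[int]:
    """Return the 0-based indices i with x[i] <= c1 * min(x) + c1.

    Illustrative helper (not from the thesis): x holds one Poisson count
    per population and c1 >= 1 is the selection constant of rule R_1.
    """
    if c1 < 1:
        raise ValueError("rule R_1 assumes c1 >= 1")
    threshold = c1 * min(x) + c1
    return [i for i, xi in enumerate(x) if xi <= threshold]


# Hypothetical counts from k = 4 populations; the threshold is 2 * 3 + 2 = 8.
print(select_subset_r1([3, 9, 4, 12], c1=2.0))  # -> [0, 2]
```

Because $c_1 \ge 1$, the population with the smallest observation always satisfies the inequality, so the selected subset is never empty.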


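For i = 1,2,...,k, let
$p_\lambda(i) = P_\lambda(\text{select population } \pi_{(i)} \mid R_1)$.

Theorem 1.2.1. $p_\lambda(i)$ is a decreasing function in
$\lambda_{[i]}$ when all other $\lambda$'s are fixed, and $p_\lambda(i)$
is an increasing function in $\lambda_{[j]}$, $j \ne i$, when all other
$\lambda$'s are fixed.

Proof.

$$
\begin{aligned}
p_\lambda(i) &= P_\lambda(\text{select population } \pi_{(i)} \mid R_1) \\
&= P_\lambda\Bigl(X_{(i)} \le c_1 \min_{1 \le j \le k} X_{(j)} + c_1\Bigr) \\
&= P_\lambda\bigl(X_{(i)} \le c_1 X_{(j)} + c_1,\ j = 1,2,\ldots,k,\ j \ne i\bigr) \\
&= \sum_{x=0}^{\infty} e^{-\lambda_{[i]}} \frac{\lambda_{[i]}^{x}}{x!}
   \prod_{\substack{j=1 \\ j \ne i}}^{k}
   \sum_{l = \langle x/c_1 - 1 \rangle}^{\infty}
   e^{-\lambda_{[j]}} \frac{\lambda_{[j]}^{l}}{l!},
\end{aligned}
$$

where $\langle a \rangle$ is the smallest integer $\ge a$. Writing each
Poisson tail as an incomplete gamma integral,

$$
\begin{aligned}
p_\lambda(i) &= \sum_{x=0}^{\infty} e^{-\lambda_{[i]}} \frac{\lambda_{[i]}^{x}}{x!}
   \prod_{\substack{j=1 \\ j \ne i}}^{k}
   \int_0^{\lambda_{[j]}}
   \frac{1}{\Gamma(\langle x/c_1 - 1 \rangle)}
   \, y^{\langle x/c_1 - 1 \rangle - 1} e^{-y} \, dy \\
&= \sum_{x=0}^{\infty}
   f(x; \lambda_{[1]}, \ldots, \hat{\lambda}_{[i]}, \ldots, \lambda_{[k]})
   \, e^{-\lambda_{[i]}} \frac{\lambda_{[i]}^{x}}{x!} \\
&= E_{\lambda_{[i]}}
   f(X; \lambda_{[1]}, \ldots, \hat{\lambda}_{[i]}, \ldots, \lambda_{[k]}),
\end{aligned}
\tag{1.2.1}
$$

where

$$
f(x; \lambda_{[1]}, \ldots, \hat{\lambda}_{[i]}, \ldots, \lambda_{[k]})
= \prod_{\substack{j=1 \\ j \ne i}}^{k}
  \int_0^{\lambda_{[j]}}
  \frac{1}{\Gamma(\langle x/c_1 - 1 \rangle)}
  \, y^{\langle x/c_1 - 1 \rangle - 1} e^{-y} \, dy
$$

and $\hat{a}$ denotes that a is deleted. From (1.2.1), it is obvious that
$p_\lambda(i)$ is increasing in $\lambda_{[j]}$, $j \ne i$, when all other
$\lambda$'s are fixed. On the other hand, for fixed $\lambda_{[j]}$,
$j \ne i$, f is a decreasing function in x, and the Poisson distribution
belongs to the SI (stochastically increasing) family, so by a lemma on
p. 112 of Lehmann [53],
$p_\lambda(i) = E_{\lambda_{[i]}} f(X; \lambda_{[1]}, \ldots, \hat{\lambda}_{[i]}, \ldots, \lambda_{[k]})$
is a decreasing function in $\lambda_{[i]}$ when the other $\lambda$'s are
fixed. Hence, the Theorem is proved.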


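Let $\Omega_0 = \{\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_k):
\lambda_1 = \lambda_2 = \cdots = \lambda_k = \lambda,\ \lambda > 0\}$.

Corollary 1.2.1.

$$
\inf_{\lambda \in \Omega} P_\lambda(CS \mid R_1)
= \inf_{\lambda \in \Omega_0} P_\lambda(CS \mid R_1)
= \inf_{\lambda > 0} \sum_{x=0}^{\infty} e^{-\lambda} \frac{\lambda^{x}}{x!}
  \Bigl\{ \sum_{l = \langle x/c_1 - 1 \rangle}^{\infty}
  e^{-\lambda} \frac{\lambda^{l}}{l!} \Bigr\}^{k-1}.
$$

Proof. The proof follows directly from Theorem 1.2.1.

It should be pointed out that the probability on the right-hand side
depends on the common unknown $\lambda$, $\lambda > 0$, so the infimum
has to be taken over $\lambda$. In Section 1.7, we discuss numerical
methods to determine this infimum and the constant for the selection
rule.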
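As a rough numerical illustration of Corollary 1.2.1, the probability of a correct selection under equal parameters can be evaluated by truncating the outer sum. This is only a sketch using the Python standard library: the function names, the truncation point and the grid of $\lambda$ values are assumptions made here, and it is not the numerical method of Section 1.7.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """P(Y = x) for Y ~ Poisson(lam), lam > 0, computed on the log scale."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


def poisson_sf_ge(m: int, lam: float) -> float:
    """P(Y >= m) for Y ~ Poisson(lam); equals 1 when m <= 0."""
    if m <= 0:
        return 1.0
    return max(0.0, 1.0 - sum(poisson_pmf(l, lam) for l in range(m)))


def pcs_r1_equal(lam: float, c1: float, k: int) -> float:
    """P(CS | R_1) when all k parameters equal lam (truncated series)."""
    # Assumed truncation point: far in the upper tail of Poisson(lam).
    x_max = int(lam + 40.0 * math.sqrt(lam) + 50.0)
    total = 0.0
    for x in range(x_max + 1):
        m = math.ceil(x / c1 - 1)  # smallest integer >= x / c1 - 1
        total += poisson_pmf(x, lam) * poisson_sf_ge(m, lam) ** (k - 1)
    return total


# Crude approximation of the infimum over an assumed grid of lambda values.
grid = [0.1 * j for j in range(1, 201)]
print(min(pcs_r1_equal(lam, c1=2.0, k=3) for lam in grid))
```

A grid search only approximates the infimum; a finer grid or a one-dimensional minimizer could be substituted.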

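Under the parameter space $\Omega$, the joint distribution of
$X_1, \ldots, X_k$, given $\sum_{i=1}^{k} X_i = t$, is a multinomial
distribution with parameters t; $\theta_1, \ldots, \theta_k$, where
$\theta_i = \lambda_i / \sum_{j=1}^{k} \lambda_j$, i = 1,...,k, i.e.

$$
P\Bigl(X_1 = x_1, \ldots, X_k = x_k \Bigm| \sum_{i=1}^{k} X_i = t\Bigr)
= \frac{t!}{x_1! \cdots x_k!} \, \theta_1^{x_1} \cdots \theta_k^{x_k}.
\tag{1.2.2}
$$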

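Lemma 1.2.1. For any t, $t \ge 0$, $c_1(t) \ge 1$ and for
$\lambda \in \Omega_0$,

$$
P_\lambda\Bigl(X_1 \le c_1(t) \min_{2 \le j \le k} X_j + c_1(t)
\Bigm| \sum_{i=1}^{k} X_i = t\Bigr)
= \sum \frac{t!}{x_1! \cdots x_k!} \Bigl(\frac{1}{k}\Bigr)^{t},
$$

where the summation is over all $x_i$'s such that
$x_1 \le c_1(t) \min_{2 \le j \le k} x_j + c_1(t)$, $x_i \ge 0$ for
i = 1,2,...,k, and $\sum_{i=1}^{k} x_i = t$.

Let

$$
A(k, t, c_1(t)) =
\sum_{\substack{x_1 \le c_1(t) \min_{2 \le j \le k} x_j + c_1(t) \\
x_i \ge 0,\ \sum x_i = t}}
\frac{t!}{x_1! \cdots x_k!} \Bigl(\frac{1}{k}\Bigr)^{t}.
\tag{1.2.3}
$$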


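Theorem 1.2.2. For given P* and any t, $t \ge 0$, let $c_1(t)$ be the
smallest non-negative number such that $A(k, t, c_1(t)) \ge P^*$. If
$c_1 = \sup\{c_1(t) : t \ge 0\}$, then

$$
\inf_{\lambda \in \Omega} P_\lambda(CS \mid R_1) \ge P^*.
$$

Proof. For $\lambda \in \Omega_0$,

$$
\begin{aligned}
P_\lambda(CS \mid R_1)
&= P_\lambda\Bigl(X_{(1)} \le c_1 \min_{2 \le j \le k} X_{(j)} + c_1\Bigr) \\
&\ge \sum_{t=0}^{\infty}
   P_\lambda\Bigl(X_{(1)} \le c_1(t) \min_{2 \le j \le k} X_{(j)} + c_1(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr)
   \cdot P_\lambda\Bigl(\sum_{i=1}^{k} X_i = t\Bigr) \\
&= \sum_{t=0}^{\infty} A(k, t, c_1(t)) \,
   P_\lambda\Bigl(\sum_{i=1}^{k} X_i = t\Bigr) \\
&\ge P^*.
\end{aligned}
$$

Thus, the Theorem follows from Corollary 1.2.1.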
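The quantity $A(k, t, c)$ in (1.2.3) is a finite multinomial sum, so for small k and t it can be evaluated by direct enumeration. The sketch below uses exact rational arithmetic; the function names are hypothetical, and the search for the smallest constant relies on the observation (an assumption of this sketch, not a statement from the thesis) that $A$ can only jump at values of the form $x_1 / (m + 1)$.

```python
from fractions import Fraction
from math import factorial
from typing import Iterator, Tuple


def compositions(total: int, parts: int) -> Iterator[Tuple[int, ...]]:
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def A(k: int, t: int, c: Fraction) -> Fraction:
    """Sum (1.2.3): P(X_1 <= c * min_{j>=2} X_j + c | sum = t), equal cells."""
    total = Fraction(0)
    for x in compositions(t, k):
        if x[0] <= c * (min(x[1:]) + 1):
            coef = factorial(t)
            for xi in x:
                coef //= factorial(xi)  # multinomial coefficient, stays integral
            total += Fraction(coef, k ** t)
    return total


def smallest_c1(k: int, t: int, p_star: float) -> Fraction:
    """Smallest non-negative c with A(k, t, c) >= p_star."""
    # A(k, t, c) is a step function of c with jumps only at x1 / (m + 1).
    candidates = sorted(
        {Fraction(a, b + 1) for a in range(t + 1) for b in range(t + 1)}
    )
    for c in candidates:
        if A(k, t, c) >= p_star:
            return c
    raise ValueError("p_star must not exceed 1")


print(A(2, 2, Fraction(1)))      # 3/4
print(smallest_c1(2, 2, 0.9))    # 2
```

The enumeration grows quickly with k and t, so this is suitable only for checking small cases; Theorem 1.2.2 then takes the supremum of these constants over t.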


(B) An Upper Bound on the Expected Subset Size Associated with $R_1$

Let S denote the size of the selected subset; then S is a random
variable taking values 1,2,...,k. Let us consider the special case
$\lambda_{[1]} = \delta\lambda$, $\delta < 1$,
$\lambda_{[2]} = \cdots = \lambda_{[k]} = \lambda$,
$\lambda > \lambda_0 > 0$, and denote the space of all slippage
configurations of this type by $\Omega_1$. We discuss the expected
subset size as follows.


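Theorem 1.2.3. For $\lambda \in \Omega_1$,

$$
E_\lambda(S \mid R_1) \le k -
\Bigl[ \inf_{t \ge [c_1]+1} g(t) + (k-1) \inf_{t \ge [c_1]+1} h(t) \Bigr]
\int_0^{(1+\delta)\lambda_0}
\frac{1}{\Gamma([c_1]+1)} \, y^{[c_1]} e^{-y} \, dy,
$$

where

$$
g(t) = \sum_{s=0}^{\left[\frac{t - c_1}{1 + c_1}\right]}
\binom{t}{s} \Bigl(\frac{1}{1+\delta}\Bigr)^{s}
\Bigl(\frac{\delta}{1+\delta}\Bigr)^{t-s},
\qquad
h(t) = \sum_{s=0}^{\left[\frac{t - c_1}{1 + c_1}\right]}
\binom{t}{s} \Bigl(\frac{\delta}{1+\delta}\Bigr)^{s}
\Bigl(\frac{1}{1+\delta}\Bigr)^{t-s},
$$

and [a] is the greatest integer < a.

Proof. For $\lambda \in \Omega_1$,

$$
\begin{aligned}
E_\lambda(S \mid R_1)
&= P_\lambda\Bigl(X_{(1)} \le c_1 \min_{2 \le j \le k} X_{(j)} + c_1\Bigr)
 + (k-1) P_\lambda\Bigl(X_{(k)} \le c_1 \min_{1 \le j \le k-1} X_{(j)} + c_1\Bigr) \\
&= k - P_\lambda\Bigl(X_{(1)} > c_1 \min_{2 \le j \le k} X_{(j)} + c_1\Bigr)
 - (k-1) P_\lambda\Bigl(X_{(k)} > c_1 \min_{1 \le j \le k-1} X_{(j)} + c_1\Bigr) \\
&\le k - P_\lambda\bigl(X_{(1)} > c_1 X_{(2)} + c_1\bigr)
 - (k-1) P_\lambda\bigl(X_{(k)} > c_1 X_{(1)} + c_1\bigr) \\
&= k - \sum_{t=0}^{\infty}
   P_\lambda\bigl(X_{(1)} > c_1 X_{(2)} + c_1 \bigm| X_{(1)} + X_{(2)} = t\bigr)
   P_\lambda\bigl(X_{(1)} + X_{(2)} = t\bigr) \\
&\quad - (k-1) \sum_{t=0}^{\infty}
   P_\lambda\bigl(X_{(k)} > c_1 X_{(1)} + c_1 \bigm| X_{(1)} + X_{(k)} = t\bigr)
   P_\lambda\bigl(X_{(1)} + X_{(k)} = t\bigr) \\
&= k - \sum_{t=0}^{\infty}
   P_\lambda\Bigl(X_{(2)} < \frac{t - c_1}{1 + c_1} \Bigm| X_{(1)} + X_{(2)} = t\Bigr)
   P_\lambda\bigl(X_{(1)} + X_{(2)} = t\bigr) \\
&\quad - (k-1) \sum_{t=0}^{\infty}
   P_\lambda\Bigl(X_{(1)} < \frac{t - c_1}{1 + c_1} \Bigm| X_{(1)} + X_{(k)} = t\Bigr)
   P_\lambda\bigl(X_{(1)} + X_{(k)} = t\bigr) \\
&= k - \sum_{t=[c_1]+1}^{\infty}
   \sum_{s=0}^{\left[\frac{t - c_1}{1 + c_1}\right]}
   \binom{t}{s} \Bigl(\frac{1}{1+\delta}\Bigr)^{s}
   \Bigl(\frac{\delta}{1+\delta}\Bigr)^{t-s}
   e^{-(1+\delta)\lambda} \frac{[(1+\delta)\lambda]^{t}}{t!} \\
&\quad - (k-1) \sum_{t=[c_1]+1}^{\infty}
   \sum_{s=0}^{\left[\frac{t - c_1}{1 + c_1}\right]}
   \binom{t}{s} \Bigl(\frac{\delta}{1+\delta}\Bigr)^{s}
   \Bigl(\frac{1}{1+\delta}\Bigr)^{t-s}
   e^{-(1+\delta)\lambda} \frac{[(1+\delta)\lambda]^{t}}{t!},
\end{aligned}
$$

where [a] is the greatest integer < a. Hence

$$
\begin{aligned}
E_\lambda(S \mid R_1)
&\le k - \inf_{t \ge [c_1]+1} g(t)
   \sum_{t=[c_1]+1}^{\infty} e^{-(1+\delta)\lambda} \frac{[(1+\delta)\lambda]^{t}}{t!}
 - (k-1) \inf_{t \ge [c_1]+1} h(t)
   \sum_{t=[c_1]+1}^{\infty} e^{-(1+\delta)\lambda} \frac{[(1+\delta)\lambda]^{t}}{t!} \\
&= k - \Bigl[ \inf_{t \ge [c_1]+1} g(t) + (k-1) \inf_{t \ge [c_1]+1} h(t) \Bigr]
   \int_0^{(1+\delta)\lambda}
   \frac{1}{\Gamma([c_1]+1)} \, y^{[c_1]} e^{-y} \, dy \\
&\le k - \Bigl[ \inf_{t \ge [c_1]+1} g(t) + (k-1) \inf_{t \ge [c_1]+1} h(t) \Bigr]
   \int_0^{(1+\delta)\lambda_0}
   \frac{1}{\Gamma([c_1]+1)} \, y^{[c_1]} e^{-y} \, dy.
\end{aligned}
$$

This completes the proof.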


1.3 The Conditional Procedure

$R_2$: Select the population $\pi_i$ in the subset if and only if

$$
X_i \le c_2(t) \min_{1 \le j \le k} X_j + c_2(t),
\quad \text{given } \sum_{i=1}^{k} X_i = t,
\tag{1.3.1}
$$

where $t \ge 0$ and $c_2(t) \ge 1$ is the smallest non-negative number
chosen to satisfy the basic probability requirement (1.1.1).

For this rule we obtain an exact result for k = 2 in Theorem
1.3.1. For $k \ge 3$, we have a lower bound for the probability of a
correct selection in Theorem 1.3.4.
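(A) Property of the Rule $R_2$

A monotonicity property of the rule is discussed in the following
theorem. As before, let $p_\lambda(i)$ denote the probability of
selecting population $\pi_{(i)}$ using rule $R_2$.

Theorem 1.3.1. For i < j, we have $p_\lambda(i) \ge p_\lambda(j)$.

Proof. Since

$$
\begin{aligned}
p_\lambda(i) &= P_\lambda(\text{select population } \pi_{(i)}) \\
&= P_\lambda\Bigl(X_{(i)} \le c_2(t) \min_{l \ne i} X_{(l)} + c_2(t)
   \Bigm| \sum_{l=1}^{k} X_{(l)} = t\Bigr) \\
&= \sum_{x_1, \ldots, \hat{x}_i, \ldots, \hat{x}_j, \ldots, x_k}
   \ \sum_{x_i}
   \binom{x_i + x_j}{x_i}
   \Bigl(\frac{p_{[i]}}{p_{[i]} + p_{[j]}}\Bigr)^{x_i}
   \Bigl(\frac{p_{[j]}}{p_{[i]} + p_{[j]}}\Bigr)^{x_j} \\
&\qquad \times
   \binom{t}{x_1, \ldots, \hat{x}_i, \ldots, \hat{x}_j, \ldots, x_k,\ x_i + x_j}
   \, p_{[1]}^{x_1} \cdots \widehat{p_{[i]}^{x_i}} \cdots
   \widehat{p_{[j]}^{x_j}} \cdots p_{[k]}^{x_k}
   \, (p_{[i]} + p_{[j]})^{x_i + x_j},
\end{aligned}
$$

where the outer sum is over $x_l$, $l \ne i, j$, and the value of
$x_i + x_j$ with total t; the inner sum is over those $x_i$ with
$x_i \le c_2(t) \min_{l \ne i,j} x_l + c_2(t)$ and

$$
x_i \le \frac{c_2(t)(x_i + x_j) + c_2(t)}{1 + c_2(t)};
$$

$p_{[i]} = \lambda_{[i]} / \sum_{l=1}^{k} \lambda_{[l]}$; and
$\hat{x}_i$ denotes that $x_i$ is deleted. Note that when $x_i$ and $x_j$
are interchanged, the second part of the above summand remains unchanged,
and the Binomial distribution belongs to the SI family; hence
$p_\lambda(i)$ is decreasing in $p_{[i]} / (p_{[i]} + p_{[j]})$. So, if
i < j, we have $p_\lambda(i) \ge p_\lambda(j)$.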


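(B) The Probability of a Correct Selection for $R_2$

Theorem 1.3.2. For a given P*, 1/2 < P* < 1, k = 2 and any $t \ge 0$,
let $c_2(t)$ be the smallest value such that

$$
P_{\Omega_0}\Bigl(X_1 \le \frac{c_2(t)(1+t)}{1 + c_2(t)}
\Bigm| X_1 + X_2 = t\Bigr) \ge P^*.
$$

Then

$$
\inf_{\lambda \in \Omega} P_\lambda(CS \mid R_2)
= \inf_{\lambda \in \Omega_0} P_\lambda(CS \mid R_2) \ge P^*.
$$

Proof. For $\lambda \in \Omega$,

$$
\begin{aligned}
P_\lambda(CS \mid R_2)
&= P_\lambda\bigl(X_{(1)} \le c_2(t) X_{(2)} + c_2(t)
   \bigm| X_{(1)} + X_{(2)} = t\bigr) \\
&= P_\lambda\Bigl(X_{(1)} \le \frac{c_2(t)(1+t)}{1 + c_2(t)}
   \Bigm| X_{(1)} + X_{(2)} = t\Bigr) \\
&= \sum_{0 \le x \le \frac{c_2(t)(1+t)}{1 + c_2(t)}}
   \binom{t}{x}
   \Bigl(\frac{\lambda_{[1]}}{\lambda_{[1]} + \lambda_{[2]}}\Bigr)^{x}
   \Bigl(\frac{\lambda_{[2]}}{\lambda_{[1]} + \lambda_{[2]}}\Bigr)^{t-x}.
\end{aligned}
$$

For fixed $\lambda_{[2]}$,
$\lambda_{[1]} / (\lambda_{[1]} + \lambda_{[2]})$ increases with
$\lambda_{[1]}$ to 1/2, and since the Binomial distribution belongs to
the SI family, the right hand side of the above expression decreases as
$\lambda_{[1]} / (\lambda_{[1]} + \lambda_{[2]})$ increases to the value
1/2. Hence,
$\inf_{\lambda \in \Omega} P_\lambda(CS \mid R_2)
= \inf_{\lambda \in \Omega_0} P_\lambda(CS \mid R_2)$. This completes
the proof.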
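For k = 2 the constant in Theorem 1.3.2 can be computed in closed form once the relevant binomial quantile is known: under equal parameters $X_1$ given $X_1 + X_2 = t$ is Binomial(t, 1/2), and $c(1+t)/(1+c) \ge m$ is equivalent to $c \ge m/(t+1-m)$. The following sketch implements this observation; the function name is hypothetical, and the additional restriction $c_2(t) \ge 1$ in the statement of the rule is not imposed here (it can be applied with `max(1, ...)` if desired).

```python
from fractions import Fraction
from math import comb


def smallest_c2_k2(t: int, p_star: float) -> Fraction:
    """Smallest c with P(Bin(t, 1/2) <= c (1 + t) / (1 + c)) >= p_star.

    Illustrative helper for the case k = 2 (name not from the thesis).
    """
    cdf = Fraction(0)
    for m in range(t + 1):
        cdf += Fraction(comb(t, m), 2 ** t)
        if cdf >= p_star:
            # c (1 + t) / (1 + c) >= m  <=>  c >= m / (t + 1 - m)
            return Fraction(m, t + 1 - m)
    raise ValueError("p_star must not exceed 1")


print(smallest_c2_k2(2, 0.9))  # 2
print(smallest_c2_k2(2, 0.7))  # 1/2
```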

For $k \ge 3$, we need the following definitions in order to discuss
the least favorable configuration of the probability of a correct
selection of the conditional rule.

Definition 1.3.1. Let $a_1 \ge a_2 \ge \cdots \ge a_m$ and
$b_1 \ge b_2 \ge \cdots \ge b_m$. If
$\sum_{i=1}^{l} a_i \ge \sum_{i=1}^{l} b_i$ for l = 1,...,m-1 and
$\sum_{i=1}^{m} a_i = \sum_{i=1}^{m} b_i$, then
$a = (a_1, a_2, \ldots, a_m)$ is said to majorize
$b = (b_1, b_2, \ldots, b_m)$, written $a \succ b$ or equivalently
$b \prec a$.

Definition 1.3.2. If a function $\phi$ satisfies the property that
$\phi(x) \le \phi(y)$ ($\phi(x) \ge \phi(y)$) whenever $x \succ y$, then
$\phi$ is called a Schur-concave (Schur-convex) function.

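We need the following lemma, due to Rinott [67], which is stated
below without proof.

Lemma 1.3.1. Let $X = (X_1, \ldots, X_k)$ have the multinomial
distribution

$$
P(X = x) = \binom{N}{x_1, \ldots, x_k} \prod_{i=1}^{k} \theta_i^{x_i},
$$

where $x = (x_1, \ldots, x_k)$, $\sum_{i=1}^{k} x_i = N$, and
$\sum_{i=1}^{k} \theta_i = 1$. Let $\phi(x_1, \ldots, x_k)$ be a Schur
function. Then $E_\theta \phi(X)$ is a Schur function.

Let $\Omega_2 = \{\lambda = (\lambda_1, \ldots, \lambda_k):
\lambda_{[1]} = \cdots = \lambda_{[k-1]} = \lambda,\
\lambda_{[k]} = \lambda',\ 0 < \lambda \le \lambda'\}$.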

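Theorem 1.3.3.
$\inf_{\lambda \in \Omega} P_\lambda(CS \mid R_2)
= \inf_{\lambda \in \Omega_2} P_\lambda(CS \mid R_2)$.

Proof.

$$
\begin{aligned}
P_\lambda(CS \mid R_2)
&= P_\lambda\Bigl(X_{(1)} \le c_2(t) \min_{2 \le j \le k} X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&= \sum_{y_1=0}^{t} \binom{t}{y_1} p_1^{y_1} (1 - p_1)^{t - y_1}
   \sum \binom{t - y_1}{y_2, \ldots, y_k}
   \prod_{j=2}^{k} \Bigl(\frac{p_j}{1 - p_1}\Bigr)^{y_j},
\end{aligned}
$$

where $p_i = \lambda_{[i]} / \sum_{j=1}^{k} \lambda_{[j]}$, i = 1,...,k,
and the inner summation is over all those $y_2, \ldots, y_k$ such that
$y_j \ge (y_1 - c_2(t)) / c_2(t)$, j = 2,...,k, and
$\sum_{j=2}^{k} y_j = t - y_1$. Let

$$
\phi_{y_1}(y_2, \ldots, y_k) =
\begin{cases}
1 & \text{if } \min_{2 \le i \le k} y_i \ge \dfrac{y_1 - c_2(t)}{c_2(t)}, \\
0 & \text{otherwise;}
\end{cases}
$$

then $P_\lambda(CS \mid R_2)$ can be written as
$E_{Y_1}\bigl[E[\phi_{Y_1}(Y_2, \ldots, Y_k) \mid Y_1 = y_1]\bigr]$, where
the joint distribution of $Y_1, \ldots, Y_k$ is a multinomial distribution
with parameters t; $p_1, \ldots, p_k$. It is easy to see that for a fixed
$y_1$, $\phi_{y_1}(y_2, \ldots, y_k)$ is a Schur-concave function. By
Lemma 1.3.1, $E[\phi_{y_1}(Y_2, \ldots, Y_k) \mid Y_1 = y_1]$ is a
Schur-concave function in
$\bigl(\frac{p_2}{1 - p_1}, \ldots, \frac{p_k}{1 - p_1}\bigr)$.
Now, since $p_1 \le p_j$, j = 2,...,k, and
$p_2 + \cdots + p_k = 1 - p_1$, we have

$$
\Bigl(\frac{p_2}{1 - p_1}, \ldots, \frac{p_k}{1 - p_1}\Bigr)
\prec
\Bigl(\frac{p_1}{1 - p_1}, \ldots, \frac{p_1}{1 - p_1},
\frac{1 - (k-1) p_1}{1 - p_1}\Bigr).
$$

Hence $P_\lambda(CS \mid R_2)$ is minimized when
$p_1 = p_2 = \cdots = p_{k-1}$ and $p_k = 1 - (k-1) p_1$, or when
$\lambda_{[1]} = \cdots = \lambda_{[k-1]} = \lambda$,
$\lambda_{[k]} = \lambda' \ge \lambda$. Thus the proof is completed.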

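Under the parameter space $\Omega_2$, the joint distribution of
$X_1, \ldots, X_k$ given $\sum_{i=1}^{k} X_i = t$ is a multinomial
distribution with parameters t; $p_1, \ldots, p_k$, where
$p_1 = \cdots = p_{k-1} = \frac{\lambda}{(k-1)\lambda + \lambda'} = p$,
$p_k = \frac{\lambda'}{(k-1)\lambda + \lambda'} = q$, $p \le q$, i.e.,

$$
P_\lambda\Bigl(X_1 = x_1, \ldots, X_k = x_k \Bigm| \sum_{i=1}^{k} X_i = t\Bigr)
= \frac{t!}{x_1! \cdots x_k!} \, p^{\sum_{i=1}^{k-1} x_i} \, q^{x_k}.
$$

Theorem 1.3.4.

$$
\inf_{\lambda \in \Omega} P_\lambda(CS \mid R_2)
= \inf_{0 < \lambda \le \lambda'}
\sum_{\substack{x_1 \le c_2(t) \min_{j \ne 1} x_j + c_2(t) \\
x_i \ge 0,\ \sum x_i = t}}
\frac{t!}{x_1! \cdots x_k!}
\Bigl(\frac{1}{k - 1 + \lambda'/\lambda}\Bigr)^{t}
\Bigl(\frac{\lambda'}{\lambda}\Bigr)^{x_k}.
$$

Proof. For $\lambda \in \Omega_2$,

$$
\begin{aligned}
P_\lambda(CS \mid R_2)
&= P_\lambda\Bigl\{X_{(1)} \le c_2(t) \min_{j \ne 1} X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr\} \\
&= \sum_{\substack{x_1 \le c_2(t) \min_{j \ne 1} x_j + c_2(t) \\
   x_i \ge 0,\ \sum x_i = t}}
   \frac{t!}{x_1! \cdots x_k!}
   \Bigl[\frac{\lambda}{(k-1)\lambda + \lambda'}\Bigr]^{\sum_{i=1}^{k-1} x_i}
   \Bigl[\frac{\lambda'}{(k-1)\lambda + \lambda'}\Bigr]^{x_k}.
\end{aligned}
$$

The Theorem follows from Theorem 1.3.3 after simplification.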

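Theorem 1.3.5. For $k \ge 3$ and for any P*, let
$P_2^* = 1 - \frac{1 - P^*}{k - 1}$. For $0 \le r \le t$, let $c_2(r)$ be
the smallest value such that

$$
P_{\Omega_0}\bigl(X_1 \le c_2(r) X_2 + c_2(r) \bigm| X_1 + X_2 = r\bigr)
\ge P_2^*.
$$

If $c_2(t) = \max\{c_2(r) : 0 \le r \le t\}$, then

$$
\inf_{\lambda \in \Omega} P_\lambda(CS \mid R_2) \ge P^*.
$$

Proof. For $\lambda \in \Omega$,

$$
\begin{aligned}
P_\lambda(CS \mid R_2)
&= P_\lambda\Bigl(X_{(1)} \le c_2(t) \min_{2 \le j \le k} X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&\ge 1 - \sum_{j=2}^{k}
   P_\lambda\Bigl(X_{(1)} > c_2(t) X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&= 1 - \sum_{j=2}^{k} \Bigl[1 -
   P_\lambda\Bigl(X_{(1)} \le c_2(t) X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr)\Bigr] \\
&= 2 - k + \sum_{j=2}^{k}
   P_\lambda\Bigl(X_{(1)} \le c_2(t) X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr).
\end{aligned}
$$

Now, we note that for fixed j (j = 2,...,k),

$$
\begin{aligned}
&P_\lambda\Bigl(X_{(1)} \le c_2(t) X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&\quad = \sum_{r=0}^{t}
   P_\lambda\bigl(X_{(1)} \le c_2(t) X_{(j)} + c_2(t)
   \bigm| X_{(1)} + X_{(j)} = r\bigr)
   \cdot P_\lambda\Bigl(X_{(1)} + X_{(j)} = r \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&\quad = \sum_{r=0}^{t}
   P_\lambda\Bigl(X_{(1)} \le \frac{c_2(t)(1+r)}{1 + c_2(t)}
   \Bigm| X_{(1)} + X_{(j)} = r\Bigr)
   \cdot P_\lambda\Bigl(X_{(1)} + X_{(j)} = r \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&\quad = \sum_{r=0}^{t}
   \sum_{0 \le s \le \frac{c_2(t)(1+r)}{1 + c_2(t)}}
   \binom{r}{s} \Bigl(\frac{p_1}{p_1 + p_j}\Bigr)^{s}
   \Bigl(\frac{p_j}{p_1 + p_j}\Bigr)^{r-s}
   \cdot P_\lambda\Bigl(X_{(1)} + X_{(j)} = r \Bigm| \sum_{i=1}^{k} X_i = t\Bigr),
\end{aligned}
$$

where $p_i = \lambda_{[i]} / \sum_{l=1}^{k} \lambda_{[l]}$. But the
Binomial distribution belongs to the SI family, so the infimum of the
expression

$$
\sum_{0 \le s \le \frac{c_2(t)(1+r)}{1 + c_2(t)}}
\binom{r}{s} \Bigl(\frac{p_1}{p_1 + p_j}\Bigr)^{s}
\Bigl(\frac{p_j}{p_1 + p_j}\Bigr)^{r-s}
$$

takes place when $p_1 = p_j$, i.e., when $\lambda_{[1]} = \lambda_{[j]}$.
Hence, for every $\lambda \in \Omega$ and every r, $0 \le r \le t$,

$$
\begin{aligned}
P_\lambda\bigl(X_{(1)} \le c_2(t) X_{(j)} + c_2(t)
   \bigm| X_{(1)} + X_{(j)} = r\bigr)
&\ge P_{\Omega_0}\bigl(X_1 \le c_2(t) X_2 + c_2(t) \bigm| X_1 + X_2 = r\bigr) \\
&\ge P_{\Omega_0}\bigl(X_1 \le c_2(r) X_2 + c_2(r) \bigm| X_1 + X_2 = r\bigr) \\
&\ge P_2^*,
\end{aligned}
$$

since $c_2(t) \ge c_2(r)$. The conditional weights sum to one over r, so

$$
P_\lambda\Bigl(X_{(1)} \le c_2(t) X_{(j)} + c_2(t)
\Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \ge P_2^*,
\qquad j = 2, \ldots, k,
$$

and therefore
$P_\lambda(CS \mid R_2) \ge 2 - k + (k-1) P_2^* = P^*$. Thus, we have the
result.

Hence, for each k and P*, Theorem 1.3.5 guarantees the existence of
$c_2(t)$ and gives a method to find $c_2(t)$ for given
$\sum_{i=1}^{k} X_i = t$ such that $P_\lambda(CS \mid R_2) \ge P^*$ for
any $\lambda \in \Omega$.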
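The construction in Theorem 1.3.5 can be sketched in a few lines of Python: compute the pairwise level $P_2^* = 1 - (1 - P^*)/(k - 1)$, find for each pair total r the smallest constant that meets this level under Binomial(r, 1/2), and take the maximum over $0 \le r \le t$. The function names are hypothetical, and the closed form $m/(r + 1 - m)$ for the pairwise constant is a rearrangement made for this sketch rather than a formula quoted from the thesis.

```python
from fractions import Fraction
from math import comb


def pairwise_constant(r: int, level: float) -> Fraction:
    """Smallest c with P(Bin(r, 1/2) <= c (1 + r) / (1 + c)) >= level."""
    cdf = Fraction(0)
    for m in range(r + 1):
        cdf += Fraction(comb(r, m), 2 ** r)
        if cdf >= level:
            # c (1 + r) / (1 + c) >= m  <=>  c >= m / (r + 1 - m)
            return Fraction(m, r + 1 - m)
    raise ValueError("level must not exceed 1")


def c2_conditional(k: int, t: int, p_star: float) -> Fraction:
    """Conservative constant c_2(t) for k >= 3 via the Bonferroni level."""
    if k < 3:
        raise ValueError("this construction is stated for k >= 3")
    p2_star = 1 - (1 - p_star) / (k - 1)
    return max(pairwise_constant(r, p2_star) for r in range(t + 1))


print(c2_conditional(3, 5, 0.9))  # 4
```

For k = 3, t = 5 and P* = 0.9 the pairwise level is 0.95, and the maximum is attained at r = 4, where only the full range of the binomial reaches that level.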


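(C) An Upper Bound on the Expected Subset Size for $R_2$

For the procedure $R_2$, the size S of the selected subset is
a random variable which can take on only integer values from 1 to k,
inclusively. For any fixed values of k and P*, the expected size of
the selected subset is a function of the true configuration
$\lambda = (\lambda_1, \ldots, \lambda_k)$. Now, consider the special
case $\lambda_{[1]} = \delta\lambda$, $\delta < 1$,
$\lambda_{[2]} = \cdots = \lambda_{[k]} = \lambda$, $\lambda > 0$. Let us
denote the space of all slippage configurations of the type discussed
here by $\Omega_1$. We investigate the expected subset size as follows.

Theorem 1.3.6. For $\lambda \in \Omega_1$,

$$
E_\lambda(S \mid R_2) \le k -
\sum_{r=0}^{t} \ \sum_{s=0}^{\left[\frac{r - c_2(t)}{1 + c_2(t)}\right]}
\binom{r}{s} \bigl\{\delta^{r-s} + (k-1)\delta^{s}\bigr\}
\binom{t}{r} \frac{(k-2)^{t-r}}{(k-1+\delta)^{t}},
$$

where [a] is the greatest integer < a.

Proof.

$$
\begin{aligned}
E_\lambda(S \mid R_2)
&= P_\lambda(\pi_{(1)} \text{ is selected})
 + \sum_{i=2}^{k} P_\lambda(\pi_{(i)} \text{ is selected}) \\
&= P_\lambda\Bigl(X_{(1)} \le c_2(t) \min_{2 \le j \le k} X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&\quad + (k-1) P_\lambda\Bigl(X_{(k)} \le c_2(t) \min_{1 \le j \le k-1} X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&= k - P_\lambda\Bigl(X_{(1)} > c_2(t) \min_{2 \le j \le k} X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&\quad - (k-1) P_\lambda\Bigl(X_{(k)} > c_2(t) \min_{1 \le j \le k-1} X_{(j)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&\le k - P_\lambda\Bigl(X_{(1)} > c_2(t) X_{(2)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr)
 - (k-1) P_\lambda\Bigl(X_{(k)} > c_2(t) X_{(1)} + c_2(t)
   \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \\
&= k - \frac{P_\lambda\bigl(X_{(1)} > c_2(t) X_{(2)} + c_2(t),\ \sum_{i=1}^{k} X_i = t\bigr)}
            {P_\lambda\bigl(\sum_{i=1}^{k} X_i = t\bigr)}
 - (k-1) \frac{P_\lambda\bigl(X_{(k)} > c_2(t) X_{(1)} + c_2(t),\ \sum_{i=1}^{k} X_i = t\bigr)}
              {P_\lambda\bigl(\sum_{i=1}^{k} X_i = t\bigr)}.
\end{aligned}
$$

Now, we note that

$$
\begin{aligned}
&P_\lambda\Bigl(X_{(1)} > c_2(t) X_{(2)} + c_2(t),\ \sum_{i=1}^{k} X_i = t\Bigr) \\
&\quad = \sum_{r=0}^{t} P_\lambda\Bigl(X_{(1)} > c_2(t) X_{(2)} + c_2(t),\
   X_{(1)} + X_{(2)} = r,\ \sum_{i \ne 1,2} X_{(i)} = t - r\Bigr) \\
&\quad = \sum_{r=0}^{t} P_\lambda\bigl(X_{(1)} > c_2(t) X_{(2)} + c_2(t),\
   X_{(1)} + X_{(2)} = r\bigr)
   \cdot P_\lambda\Bigl(\sum_{i \ne 1,2} X_{(i)} = t - r\Bigr),
\end{aligned}
$$

since the event $\{\sum_{i \ne 1,2} X_{(i)} = t - r\}$ is independent of
the event
$\{X_{(1)} > c_2(t) X_{(2)} + c_2(t),\ X_{(1)} + X_{(2)} = r\}$,

$$
= \sum_{r=0}^{t}
  P_\lambda\bigl(X_{(1)} > c_2(t) X_{(2)} + c_2(t) \bigm| X_{(1)} + X_{(2)} = r\bigr)
  \cdot P_\lambda\bigl(X_{(1)} + X_{(2)} = r\bigr)
  \cdot P_\lambda\Bigl(\sum_{i \ne 1,2} X_{(i)} = t - r\Bigr),
$$

and a similar expression holds for
$P_\lambda\bigl(X_{(k)} > c_2(t) X_{(1)} + c_2(t),\ \sum_{i=1}^{k} X_i = t\bigr)$.
So,

$$
\begin{aligned}
E_\lambda(S \mid R_2)
&\le k - \frac{1}{P_\lambda\bigl(\sum_{i=1}^{k} X_i = t\bigr)}
   \sum_{r=0}^{t}
   P_\lambda\Bigl(X_{(2)} < \frac{r - c_2(t)}{1 + c_2(t)}
   \Bigm| X_{(1)} + X_{(2)} = r\Bigr)
   P_\lambda\bigl(X_{(1)} + X_{(2)} = r\bigr)
   P_\lambda\Bigl(\sum_{i \ne 1,2} X_{(i)} = t - r\Bigr) \\
&\quad - \frac{k-1}{P_\lambda\bigl(\sum_{i=1}^{k} X_i = t\bigr)}
   \sum_{r=0}^{t}
   P_\lambda\Bigl(X_{(1)} < \frac{r - c_2(t)}{1 + c_2(t)}
   \Bigm| X_{(1)} + X_{(k)} = r\Bigr)
   P_\lambda\bigl(X_{(1)} + X_{(k)} = r\bigr)
   P_\lambda\Bigl(\sum_{i \ne 1,k} X_{(i)} = t - r\Bigr) \\
&= k - \frac{t!}{[(k-1+\delta)\lambda]^{t} e^{-(k-1+\delta)\lambda}}
   \sum_{r=0}^{t} \sum_{s=0}^{\left[\frac{r - c_2(t)}{1 + c_2(t)}\right]}
   \binom{r}{s} \Bigl(\frac{1}{1+\delta}\Bigr)^{s}
   \Bigl(\frac{\delta}{1+\delta}\Bigr)^{r-s}
   \frac{[(1+\delta)\lambda]^{r} e^{-(1+\delta)\lambda}}{r!}
   \cdot \frac{[(k-2)\lambda]^{t-r} e^{-(k-2)\lambda}}{(t-r)!} \\
&\quad - \frac{(k-1)\, t!}{[(k-1+\delta)\lambda]^{t} e^{-(k-1+\delta)\lambda}}
   \sum_{r=0}^{t} \sum_{s=0}^{\left[\frac{r - c_2(t)}{1 + c_2(t)}\right]}
   \binom{r}{s} \Bigl(\frac{\delta}{1+\delta}\Bigr)^{s}
   \Bigl(\frac{1}{1+\delta}\Bigr)^{r-s}
   \frac{[(1+\delta)\lambda]^{r} e^{-(1+\delta)\lambda}}{r!}
   \cdot \frac{[(k-2)\lambda]^{t-r} e^{-(k-2)\lambda}}{(t-r)!}.
\end{aligned}
$$

After simplifying, we have the result.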
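The bound of Theorem 1.3.6 is a finite double sum and can be evaluated directly. The sketch below assumes the form of the bound obtained from the proof above, with the inner sum running over integers s strictly less than $(r - c_2(t))/(1 + c_2(t))$; the function name and the parameter values are illustrative only.

```python
from math import comb


def expected_size_bound_r2(k: int, t: int, c2: float, delta: float) -> float:
    """Upper bound on E(S | R_2) under the slippage configuration.

    Evaluates k - sum_r sum_s C(r, s) {delta^(r-s) + (k-1) delta^s}
                  * C(t, r) (k-2)^(t-r) / (k-1+delta)^t,
    where s ranges over integers with s < (r - c2) / (1 + c2).
    """
    total = 0.0
    for r in range(t + 1):
        weight = comb(t, r) * (k - 2) ** (t - r) / (k - 1 + delta) ** t
        for s in range(r + 1):
            if s * (1 + c2) < r - c2:  # same as s < (r - c2) / (1 + c2)
                total += (
                    comb(r, s)
                    * (delta ** (r - s) + (k - 1) * delta ** s)
                    * weight
                )
    return k - total


# Hypothetical inputs: k = 4 populations, total t = 10, c_2(t) = 1.5.
print(expected_size_bound_r2(4, 10, 1.5, 0.5))
```

For k = 2 the two pairwise events in the proof are the only ones involved, so the bound coincides with the exact conditional expected subset size; this gives a convenient sanity check.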


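1.4 A Different Selection Procedure

$$
\sum_{x=0}^{\infty} e^{-\lambda_{[1]}} \frac{\lambda_{[1]}^{x}}{x!}
\left[ \int_0^{\sum_{j=2}^{k} \lambda_{[j]}}
\frac{1}{\Gamma\bigl(\langle (k-1)(\frac{x}{c_3} - 1) \rangle\bigr)}
\, y^{\langle (k-1)(\frac{x}{c_3} - 1) \rangle - 1} e^{-y} \, dy \right]
$$

The proof follows by observing that the expression in the square brackets
is a monotone increasing function of $\sum_{j=2}^{k} \lambda_{[j]}$.

By using an argument analogous to that in the proof of Theorem 1.2.2,
we have the following theorem.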

Theorem 1.4.2. For any P* and any t, $t \ge 0$, let $c_3(t)$ be the
smallest value such that

$$
\sum_{0 \le r \le \frac{(k-1) c_3(t) + t c_3(t)}{k - 1 + c_3(t)}}
\binom{t}{r} \Bigl(\frac{1}{k}\Bigr)^{r}
\Bigl(\frac{k-1}{k}\Bigr)^{t-r} \ge P^*.
$$

If $c_3 = \sup\{c_3(t) : t \ge 0\}$, then
$\inf_{\lambda \in \Omega} P_\lambda(CS \mid R_3) \ge P^*$.

Consider the special configuration $\lambda_{[1]} = \delta\lambda$,
$\delta < 1$; $\lambda_{[2]} = \cdots = \lambda_{[k]} = \lambda$,
$\lambda > \lambda_0 > 0$. Using the same notation as in Section 1.2,
the space of all such slippage configurations is denoted by $\Omega_1$.
In the following theorem, we give an upper bound for the expected
subset size S.

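Theorem 1.4.3. For $\lambda \in \Omega_1$,

$$
E_\lambda(S \mid R_3) \le \sup_{r \ge 2} g(r)
+ \bigl(k - \sup_{r \ge 2} g(r)\bigr)
\bigl(1 + (k-1+\delta)\lambda_0\bigr) e^{-(k-1+\delta)\lambda_0},
$$

where

$$
g(r) = \sum_{0 \le s \le \frac{c_3 (r + k - 1)}{k - 1 + c_3}}
\Bigl[ \binom{r}{s} \Bigl(\frac{\delta}{k-1+\delta}\Bigr)^{s}
\Bigl(1 - \frac{\delta}{k-1+\delta}\Bigr)^{r-s}
+ (k-1) \binom{r}{s} \Bigl(\frac{1}{k-1+\delta}\Bigr)^{s}
\Bigl(1 - \frac{1}{k-1+\delta}\Bigr)^{r-s} \Bigr].
$$

Proof.

$$
\begin{aligned}
E_\lambda(S \mid R_3)
&= P\Bigl\{X_{(1)} \le c_3 + \frac{c_3}{k-1} \sum_{j=2}^{k} X_{(j)}\Bigr\}
 + (k-1) P\Bigl\{X_{(k)} \le c_3 + \frac{c_3}{k-1} \sum_{j=1}^{k-1} X_{(j)}\Bigr\} \\
&= \sum_{r=0}^{\infty}
   P\Bigl\{X_{(1)} \le c_3 + \frac{c_3}{k-1} \sum_{j=2}^{k} X_{(j)}
   \Bigm| \sum_{i=1}^{k} X_i = r\Bigr\}
   P\Bigl\{\sum_{i=1}^{k} X_i = r\Bigr\} \\
&\quad + (k-1) \sum_{r=0}^{\infty}
   P\Bigl\{X_{(k)} \le c_3 + \frac{c_3}{k-1} \sum_{j=1}^{k-1} X_{(j)}
   \Bigm| \sum_{i=1}^{k} X_i = r\Bigr\}
   P\Bigl\{\sum_{i=1}^{k} X_i = r\Bigr\} \\
&= \sum_{r=0}^{\infty}
   P\Bigl\{X_{(1)} \le \frac{c_3 (r + k - 1)}{c_3 + k - 1}
   \Bigm| \sum_{i=1}^{k} X_i = r\Bigr\}
   P\Bigl\{\sum_{i=1}^{k} X_i = r\Bigr\} \\
&\quad + (k-1) \sum_{r=0}^{\infty}
   P\Bigl\{X_{(k)} \le \frac{c_3 (r + k - 1)}{c_3 + k - 1}
   \Bigm| \sum_{i=1}^{k} X_i = r\Bigr\}
   P\Bigl\{\sum_{i=1}^{k} X_i = r\Bigr\} \\
&= \sum_{r=0}^{\infty} g(r) \,
   e^{-(k-1+\delta)\lambda} \frac{[(k-1+\delta)\lambda]^{r}}{r!} \\
&\le k \bigl\{ e^{-(k-1+\delta)\lambda}
   + e^{-(k-1+\delta)\lambda} (k-1+\delta)\lambda \bigr\}
   + \sup_{r \ge 2} g(r) \sum_{r=2}^{\infty}
   e^{-(k-1+\delta)\lambda} \frac{[(k-1+\delta)\lambda]^{r}}{r!} \\
&= \sup_{r \ge 2} g(r) + \bigl(k - \sup_{r \ge 2} g(r)\bigr)
   \bigl\{ e^{-(k-1+\delta)\lambda}
   + e^{-(k-1+\delta)\lambda} (k-1+\delta)\lambda \bigr\} \\
&\le \sup_{r \ge 2} g(r) + \bigl(k - \sup_{r \ge 2} g(r)\bigr)
   \bigl[1 + (k-1+\delta)\lambda_0\bigr] e^{-(k-1+\delta)\lambda_0}.
\end{aligned}
$$

The proof is completed.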


(B) A Conditional Procedure $R_4$

We also consider a conditional rule as follows.

$R_4$: Select the population $\pi_i$ if and only if

$$
X_i \le c_4(t) + \frac{c_4(t)}{k-1} \sum_{j \ne i} X_j,
\quad \text{given } \sum_{i=1}^{k} X_i = t.
$$

We know that $X_1, \ldots, X_k$ given $\sum_{i=1}^{k} X_i = t$ is
distributed as multinomial with parameters t;
$\lambda_1 / \sum_{j=1}^{k} \lambda_j, \ldots,
\lambda_k / \sum_{j=1}^{k} \lambda_j$, and the marginal distribution of
$X_i$ given $\sum_{i=1}^{k} X_i = t$ is binomial with parameters t and
$\lambda_i / \sum_{j=1}^{k} \lambda_j$.


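Theorem 1.4.4.

\[
\inf_{\lambda \in \Omega} P_\lambda(CS \mid R_4) = \inf_{\lambda \in \Omega_0} P_\lambda(CS \mid R_4).
\]

Proof.  For $\lambda \in \Omega$,

\[
P_\lambda(CS \mid R_4) = P_\lambda\Bigl\{X_{(1)} \le c_4(t) + \frac{c_4(t)}{k-1}\sum_{j=2}^{k} X_{(j)} \Bigm| \sum_{i=1}^{k} X_i = t\Bigr\}
= P_\lambda\Bigl\{X_{(1)} \le D(t) \Bigm| \sum_{i=1}^{k} X_i = t\Bigr\},
\]

where $D(t) = \dfrac{c_4(t)\,(t+k-1)}{c_4(t)+k-1}$.  Since $X_{(1)}$ given $\sum_{i=1}^{k} X_i = t$ is distributed
as $B\bigl(t,\ \lambda_{[1]} / \sum_{j=1}^{k} \lambda_{[j]}\bigr)$, which belongs to the SI (stochastically increasing) family, we have

\[
\inf_{\lambda \in \Omega} P_\lambda(CS \mid R_4) = \inf_{\lambda \in \Omega_0} P_\lambda(CS \mid R_4) = \sum_{i=0}^{[D(t)]} \binom{t}{i} \Bigl(\frac{1}{k}\Bigr)^{i} \Bigl(\frac{k-1}{k}\Bigr)^{t-i}.
\]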

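Note that the infimum of the probability of a correct selection is
independent of the common value $\lambda$, and $c_4(t)$ is the smallest constant
determined from the following inequality:

\[
\sum_{i=0}^{[D(t)]} \binom{t}{i} \Bigl(\frac{1}{k}\Bigr)^{i} \Bigl(\frac{k-1}{k}\Bigr)^{t-i} \ge P^* ,
\qquad D(t) = \frac{c_4(t)\,(t+k-1)}{c_4(t)+k-1} .
\]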
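The determination of the conditional constant can be sketched in a few lines. The code below is an illustration under stated assumptions, not the author's program: it assumes the cutoff has the form D(t) = c(t+k-1)/(c+k-1), imposes no further restriction on the constant (the report may impose one), and uses hypothetical function names. Because the binomial sum depends on the constant only through the integer part of D(t), it first finds the smallest integer cutoff m that meets the P* requirement and then the smallest c whose D(t) reaches m.

```python
from fractions import Fraction
from math import comb, floor


def binom_cdf(m, t, p):
    """P{B(t, p) <= m}, computed exactly when p is a Fraction."""
    return sum(comb(t, i) * p**i * (1 - p) ** (t - i) for i in range(m + 1))


def cutoff_D(c, t, k):
    """Assumed cutoff D(t) = c (t + k - 1) / (c + k - 1)."""
    return c * (t + k - 1) / (c + k - 1)


def smallest_c4(t, k, p_star):
    """Return (m, c): m is the smallest integer with P{B(t, 1/k) <= m} >= p_star,
    and c is the smallest constant for which D(t) >= m."""
    p = Fraction(1, k)
    m = next(m for m in range(t + 1) if binom_cdf(m, t, p) >= p_star)
    # D(t) >= m  <=>  c (t + k - 1 - m) >= m (k - 1)
    c = Fraction(m * (k - 1), t + k - 1 - m)
    return m, c


m, c = smallest_c4(t=10, k=3, p_star=0.9)
```

Exact rational arithmetic is used so that the integer part of D(t) is not disturbed by rounding at the boundary.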

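Theorem 1.4.5.

\[
E_\lambda(S \mid R_4) = \sum_{s=0}^{[D(t)]} \left\{ \binom{t}{s}\Bigl(\frac{\delta}{k-1+\delta}\Bigr)^{s}\Bigl(\frac{k-1}{k-1+\delta}\Bigr)^{t-s} + (k-1)\binom{t}{s}\Bigl(\frac{1}{k-1+\delta}\Bigr)^{s}\Bigl(1-\frac{1}{k-1+\delta}\Bigr)^{t-s} \right\}.
\]

Proof.

\[
\begin{aligned}
E_\lambda(S \mid R_4)
&= P_\lambda\Bigl\{X_{(1)} \le c_4(t) + \frac{c_4(t)}{k-1}\sum_{j=2}^{k} X_{(j)} \Bigm| \sum_{i=1}^{k} X_i = t\Bigr\} \\
&\quad + (k-1)\, P_\lambda\Bigl\{X_{(k)} \le c_4(t) + \frac{c_4(t)}{k-1}\sum_{j=1}^{k-1} X_{(j)} \Bigm| \sum_{i=1}^{k} X_i = t\Bigr\} \\
&= P_\lambda\Bigl\{X_{(1)} \le D(t) \Bigm| \sum_{i=1}^{k} X_i = t\Bigr\} + (k-1)\, P_\lambda\Bigl\{X_{(k)} \le D(t) \Bigm| \sum_{i=1}^{k} X_i = t\Bigr\}.
\end{aligned}
\]

The theorem follows easily.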


1.5  Selecting a Subset which Contains All Populations
Better Than a Standard

In this section, we discuss a related problem.

Let $\pi_0, \pi_1, \ldots, \pi_k$ be k+1 independent Poisson populations with
parameters $\lambda_0, \lambda_1, \ldots, \lambda_k$ respectively.  We use the same notation and
definitions as in Section 1.2.  Population $\pi_i$ is said to be better than
the standard if $\lambda_i \le \lambda_0$.  The procedure described in this section controls
the probability that the selected subset contains all those populations
better than the standard and guarantees this probability of such a
correct decision to be at least P*.  Let $X_i$ denote the observation from
$\pi_i$ (i = 1,2,...,k).  Let $r_1$ and $r_2$ denote the number of populations
with $\lambda \le \lambda_0$ and $\lambda > \lambda_0$ respectively.  We discuss the following cases:
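Case (i):  Known Standard

We assume $\lambda_0$ is known, and propose a procedure as follows:

$R_{d_1}$: Retain in the selected subset those and only those populations $\pi_i$
for which

\[
X_i \le d_1(\lambda_0 + 1) \tag{1.5.1}
\]

where $d_1 \ge 1$ is the smallest number to be determined below.
The probability $P_1$ of a correct decision is given by

\[
P_1 = \prod_{i=1}^{r_1} P\{X_{(i)} \le d_1(\lambda_0 + 1)\}
\ \ge\ \prod_{i=1}^{r_1} P_{\lambda_0}\{X_{(i)} \le d_1(\lambda_0 + 1)\}
= \Bigl[\, \sum_{j=0}^{[d_1(\lambda_0+1)]} e^{-\lambda_0}\, \frac{\lambda_0^{\,j}}{j!} \Bigr]^{r_1}. \tag{1.5.2}
\]

Remark 1.5.1.  By solving for the smallest value $d_1$ satisfying

\[
\Bigl[\, \sum_{j=0}^{[d_1(\lambda_0+1)]} e^{-\lambda_0}\, \frac{\lambda_0^{\,j}}{j!} \Bigr]^{k} \ge P^*
\]

we obtain the procedure.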
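The constant for the known-standard case can be found by a direct search. The following is a minimal sketch under the assumption that the requirement is "the k-th power of the Poisson cdf at the integer part of d(λ0 + 1) is at least P*", with d restricted to be at least 1; the function names are hypothetical and the numerical inputs are illustrative, not taken from the report.

```python
from math import exp


def poisson_cdf(m, lam):
    """P{X <= m} for X ~ Poisson(lam), by direct summation."""
    term, total = exp(-lam), 0.0
    for j in range(m + 1):
        total += term
        term *= lam / (j + 1)
    return total


def smallest_d1(lam0, k, p_star):
    """Return (m, d): m is the smallest integer cutoff with
    poisson_cdf(m, lam0)**k >= p_star (requires p_star < 1), and
    d = max(1, m / (lam0 + 1)) is the corresponding smallest constant."""
    m = 0
    while poisson_cdf(m, lam0) ** k < p_star:
        m += 1
    return m, max(1.0, m / (lam0 + 1.0))


m1, d1 = smallest_d1(lam0=2.0, k=3, p_star=0.9)
```

The search works on the integer cutoff because the bound changes only when the integer part of d(λ0 + 1) changes; dividing by λ0 + 1 then recovers the constant.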
Case (ii):  $\lambda_0$ unknown.

Let $X_0$ be an observation from $\pi_0$.  Then we propose the following
procedure.

$R_{d_2}$: Retain in the selected subset those and only those populations $\pi_i$
for which

\[
X_i \le d_2(X_0 + 1) \tag{1.5.3}
\]

where $d_2 \ge 1$ is the smallest value to be determined below.

Then the probability $P_2$ of a correct decision is given by

\[
\begin{aligned}
P_2 &= P\{X_{(i)} \le d_2(X_0 + 1),\ i = 1, \ldots, r_1\} \\
&= \sum_{x=0}^{\infty} \Bigl\{ \prod_{j=1}^{r_1} \sum_{x_j=0}^{[d_2(x+1)]} e^{-\lambda_{[j]}}\, \frac{\lambda_{[j]}^{\,x_j}}{x_j!} \Bigr\}\, e^{-\lambda_0}\, \frac{\lambda_0^{\,x}}{x!} \\
&\ge \sum_{x=0}^{\infty} \Bigl\{ \sum_{y=0}^{[d_2(x+1)]} e^{-\lambda_0}\, \frac{\lambda_0^{\,y}}{y!} \Bigr\}^{r_1} e^{-\lambda_0}\, \frac{\lambda_0^{\,x}}{x!} \\
&\ge \sum_{x=0}^{\infty} \Bigl\{ \sum_{y=0}^{[d_2(x+1)]} e^{-\lambda_0}\, \frac{\lambda_0^{\,y}}{y!} \Bigr\}^{k} e^{-\lambda_0}\, \frac{\lambda_0^{\,x}}{x!} \\
&\ge \bigl\{ P(X_1 \le d_2(X_0 + 1)) \bigr\}^{k}, \quad \text{where } X_1 \text{ is Poisson with parameter } \lambda_0, \qquad (1.5.4) \\
&= \Bigl\{ \sum_{r=0}^{\infty} P(X_1 \le d_2(X_0 + 1) \mid X_0 + X_1 = r)\, P(X_0 + X_1 = r) \Bigr\}^{k}.
\end{aligned}
\]

For any fixed $r \ge 0$, let $d_2(r)$ be the smallest number such that

\[
A(2, r, d_2(r)) \ge (P^*)^{1/k},
\]

where $A(2, r, d_2(r))$ is defined in (1.2.3).  Let $d_2 = \sup\{d_2(r) : r \ge 0\}$;
we have $P_2 \ge P^*$.

1.6  Applications to a Test of Homogeneity
for $\lambda_1 = \lambda_2 = \cdots = \lambda_k$

In some practical situations one wishes to know whether the $\lambda_i$ are
significantly different or not.  This is the problem of the test of
homogeneity of the Poisson populations.  In order to test the homogeneity
of k experiments, i.e. to test H: $\lambda_1 = \lambda_2 = \cdots = \lambda_k = \lambda_0$ against the
alternative $H_1$: not H, we propose the following rules $\phi_1$ and $\phi_2(T)$.

(1)  The procedure $\phi_1$: H is accepted if, and only if,
$X_{\max} - c\, X_{\min} \le c$, where c is some constant depending
on k, $\lambda_0$, and the level of significance $\alpha$.

(2)  The procedure $\phi_2(T)$: H is accepted if and only if
$X_{\max} - c(t)\, X_{\min} \le c(t)$, given that $T = \sum_{i=1}^{k} X_i = t$.

For the procedure $\phi_1$, if we choose $c = \sup\{c(t) : t \ge 0\}$, where for
any t, $t \ge 0$, c(t) is the smallest constant such that

\[
A(k, t, c(t)) \ge 1 - \frac{\alpha}{k},
\]

then, for $\lambda \in \Omega_0$, i.e., when H is true,

\[
\begin{aligned}
P_\lambda\Bigl\{\max_{1 \le j \le k} X_j - c \min_{1 \le j \le k} X_j \le c\Bigr\}
&= 1 - P_\lambda\Bigl\{\max_{1 \le j \le k} X_j > c \min_{1 \le j \le k} X_j + c\Bigr\} \\
&\ge 1 - \sum_{i=1}^{k} P_\lambda\Bigl\{X_i > c \min_{1 \le j \le k,\ j \ne i} X_j + c\Bigr\} \\
&= 1 - k + \sum_{i=1}^{k} P_\lambda\Bigl\{X_i \le c \min_{1 \le j \le k,\ j \ne i} X_j + c\Bigr\} \\
&= 1 - k + \sum_{i=1}^{k} \sum_{t=0}^{\infty} P_\lambda\Bigl\{X_i \le c \min_{1 \le j \le k,\ j \ne i} X_j + c \Bigm| \sum_{i=1}^{k} X_i = t\Bigr\}\, P\Bigl(\sum_{i=1}^{k} X_i = t\Bigr) \\
&\ge 1 - k + \sum_{i=1}^{k} \sum_{t=0}^{\infty} P_\lambda\Bigl\{X_i \le c(t) \min_{1 \le j \le k,\ j \ne i} X_j + c(t) \Bigm| \sum_{i=1}^{k} X_i = t\Bigr\}\, P\Bigl(\sum_{i=1}^{k} X_i = t\Bigr) \\
&\ge 1 - \alpha
\end{aligned}
\]

by Lemma 1.2.1.  Hence $P_{\lambda \in \Omega_0}[\text{Reject } H] \le \alpha$.

The  probability  of  the  error  of  the  first  kind  for  ^(T)  is  then 
given  by 


by  Lemma  1.2.1.  Hence,  for  given  significance  level  a,  we  can  find  c(t) 
such that $E(\phi_2(T) \mid H, \alpha) \le \alpha$.
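The unconditional rule of this section is easy to state in code. The sketch below is illustrative: it implements the acceptance region "max X - c min X is at most c" and adds a small Monte Carlo estimate of the rejection rate under equal parameters. The simulation, the Poisson sampler and all numerical inputs are assumptions made for the example; the constant c is taken as given, since in the report it comes from the quantity A(k, t, c(t)) of (1.2.3).

```python
import random
from math import exp


def accept_homogeneity(counts, c):
    """Accept H iff max X_j - c * min X_j <= c."""
    return max(counts) - c * min(counts) <= c


def poisson_draw(lam, rng):
    """One Poisson(lam) variate by multiplying uniforms (suitable for small lam)."""
    limit, count, prod = exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return count
        count += 1


def estimated_type1_error(lam, k, c, reps=20000, seed=0):
    """Monte Carlo estimate of P[Reject H] when all k parameters equal lam."""
    rng = random.Random(seed)
    rejections = sum(
        not accept_homogeneity([poisson_draw(lam, rng) for _ in range(k)], c)
        for _ in range(reps)
    )
    return rejections / reps
```

For instance, `accept_homogeneity([3, 4, 5], 2.0)` accepts, while `accept_homogeneity([0, 0, 9], 2.0)` rejects because the largest count exceeds c when the smallest count is zero.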


1.7  Explanations of the Tables

(1)  Tables I, II and III list the infimum of the probability of a
correct selection (approximate value) for the rules $R_1$, $R_3$ and
$R_1^*$.  $R_1$ and $R_3$ are proposed and studied in this paper and $R_1^*$
is the selection procedure for selecting a subset to include
the population associated with $\lambda_{[k]}$ discussed in Gupta and
Huang [43].  It should be pointed out that the probability
of a correct selection for all these rules is decreasing
with $\lambda$ when $\lambda$ is small and then it is increasing again with $\lambda$.
Hence, the approximate infimum can be determined numerically
by computing the probability as a function of $\lambda$, for fixed
values of c.  For given k and P*, the selection constants
(approximately) can be found from these tables.  For example,
for  P*  = . 350A,  and  k = 4,  the  approximate  value  of  c 
associated  with  Rj  is  Z.b. 

(2)  In Tables IVA, IVB, IVC and IVD, the first entry denotes the
probability of selecting the best population, the second
entry denotes the probability of selecting a non-best
population and the third entry is the expected proportion,
all under the slippage configuration $\lambda_{[1]} = \delta\lambda$, $\delta < 1$;
$\lambda_{[2]} = \cdots = \lambda_{[k]} = \lambda$, when the rule $R_1$ is used.  The three
entries in Tables VA, VB, VC, VD define the same quantities
for the rule $R_3$.  For example, from Table IVC, we find that
for the rule $R_1$ if $\lambda = 2.50$ and c = 1.50 (k = 5 and $\delta = .3$), the
probability of a correct selection is .9461, the probability
of selecting a non-best population is .4879 and the expected
proportion of populations in the selected subset is .5796.
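The third entry of each triple is determined by the first two through the formula [(a) + (k-1)(b)]/k quoted in the table captions, which gives a quick consistency check on any tabulated triple. A minimal sketch, reading the entries quoted in the example above as .9461 and .4879 (the function name is hypothetical):

```python
def expected_proportion(p_best, p_nonbest, k):
    """Expected proportion of selected populations, [(a) + (k-1)(b)] / k."""
    return (p_best + (k - 1) * p_nonbest) / k


value = expected_proportion(0.9461, 0.4879, 5)
```

The result agrees with the quoted third entry to about three decimal places; a small discrepancy in the fourth place is expected because the tabulated probabilities are themselves rounded.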
1.8  Some Remarks on the Comparison of $R_1$ and $R_3$

We define a rule R to be better than another rule R' if the
expected proportion for R is smaller than the expected proportion for
R'.  We compare the performance of the rules $R_1$ and $R_3$ in this respect.
For example, when k = 5, P* = 0.92, we obtain the approximate values of
the selection constants for $R_1$ and $R_3$ as $c_1 = 3.0$, $c_3 = 1.55$ from Table I
and Table II respectively.  For these constants Tables IV, V show that
if $\delta$ is kept fixed $R_3$ seems to be better than $R_1$ when $\lambda$ is small, while
$R_1$ performs better than $R_3$ for large values of $\lambda$.
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Table  of  inf  P(CS  RT)  (Approximate)  Using  the  Rule 


parameter 


Table  IVA 


V) 


Using the rule $R_1$ and under the configuration $(\delta\lambda, \lambda, \ldots, \lambda)$, this
table gives in order the triple (a) the probability of selecting a best
population, (b) the probability of selecting any non-best population and
(c) the expected proportion of the selected populations
([(a)+(k-1)(b)]/k).

k = 3,  $\delta$ = 0.3


>'  \ 

1 .5 

1.75 

2.0 

2.5 

3.0 

3.5 

4.0 

4.5 

5.0 

0.50 

0.9913 

0.9913 

0.9995 
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0.9128 
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0.9622 

0.9624 
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r)3  '0.9CQ4  10.9998 


0.6S  .3  ,0.  76 14  0.812^  0.8655 


5.00 


Table  IVD 


Using the rule $R_1$ and under the configuration $(\delta\lambda, \lambda, \ldots, \lambda)$, this
table gives in order the triple (a) the probability of selecting a best
population, (b) the probability of selecting any non-best population
and (c) the expected proportion of the selected populations
([(a)+(k-1)(b)]/k).
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IVB  (continued) 

k = 3,  <$  = 0.5 
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Table  IVC  (continued) 


k = 5.  <$  - 0. 3 
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Table  IVD 

Using the rule $R_1$ and under the configuration $(\delta\lambda, \lambda, \ldots, \lambda)$, this
table gives in order the triple (a) the probability of selecting a best
population, (b) the probability of selecting any non-best population and
(c) the expected proportion of the selected populations
([(a)+(k-1)(b)]/k).


k = 5,  $\delta$ = 0.5
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0.9860 

0.9860 

BBpl 

0.9973 

0.9995 

1.50 

0.8873 

0.8879 

0.9739 

0.9792 

0.9953 

0.9953 

0.9993 

0.9999 

0.6516 

0.6519 

0.8518 

0.8551 

0.9503 

0.9505 

O.986G 

0.9860 

0.9966 

0.6988 

0.6989 

0.8762 

CA 

CO 

CO 

0 

0.9593 

0.9595 

0.9886 

0.9886 

0.9973 

2.00 

1 

1 0.0798 

0.8751 

0.9629 

0.9693 

0.9915 

0.9916 

0.9983 

0.9983 

0.9997 

I 0.5959 

0.5973 

0.7899 

0.8029 

0.9139 

0.9151 

0.9687 

0.9688 

0.9902 

0.6513 

0.6529 

0.8295 

0.8351 

0.9299 

0.9309 

0.97^7 

0.97^7 

0.9921 

2.50 

0.8776 

0.8792 

0.9576 

0.9610 

0.9887 

0.9888 

0.9973 

0.9973 

0.999** 

0.5651 

0.5730 

0.7978 

0.7772 

0.3853 

0.8895 

0.9507 

0.951 1 

O.98II 

0.6276 

0.6392 

0.7898 

0.8190 

0.9099 

0.9600 

0.9603 

O.98A8 

Table  VA 
Using the rule $R_3$ and under the configuration $(\delta\lambda, \lambda, \ldots, \lambda)$, this
table gives in order the triple (a) the probability of selecting a best
population, (b) the probability of selecting any non-best population
and (c) the expected proportion of the selected populations
([(a)+(k-1)(b)]/k).

k = 3,  $\delta$ = 0.3

 λ \ c_3       1.50    1.75    2.00    2.50    3.00    3.50    4.00    4.50    5.00

 0.50   (a)   .9960   .9962   .9993   .9998   .9999   .9999   .9999   .9999   .9999
        (b)   .9477   .9521   .9918   .9924   .9990   .9990   .9999   .9999   .9999
        (c)   .9638   .9668   .9945   .9948   .9993   .9993   .9999   .9999   .9999

 0.75   (a)   .9945   .9950   .9996   .9996   .9999   .9999   .9999   .9999   .9999
        (b)   .9179   .9305   .9817   .9843   .9971   .9972   .9995   .9995   .9999
        (c)   .9434   .9520   .9877   .9894   .9981   .9981   .9997   .9997   .9999

 1.00   (a)   .9939   .9948   .9994   .9995   .9999   .9999   .9999   .9999   .9999
        (b)   .8931   .9165   .9701   .9766   .9945   .9947   .9989   .9990   .9998
        (c)   .9267   .9426   .9799   .9842   .9963   .9965   .9993   .9993   .9998

 1.50   (a)   .9940   .9957   .9992   .9994   .9999   .9999   .9999   .9999   .9999
        (b)   .8530   .8974   .9461   .9662   .9891   .9903   .9973   .9973   .9993
        (c)   .9000   .9302   .9638   .9773   .9927   .9935   .9981   .9982   .9995

 2.00   (a)   .9948   .9969   .9991   .9995   .9999   .9999   .9999   .9999   .9999
        (b)   .8176   .8782   .9227   .9605   .9848   .9881   .9958   .9960   .9987
        (c)   .8766   .9178   .9482   .9735   .9898   .9921   .9972   .9973   .9991

 2.50   (a)   .9956   .9977   .9992   .9996   .9999   .9999   .9999   .9999   .9999
        (b)   .7855   .8583   .9010   .9559   .9814   .9876   .9949   .9955   .9983
        (c)   .8555   .9043   .9337   .9705   .9875   .9917   .9966   .9970   .9988

Using the rule $R_3$ and under the configuration $(\delta\lambda, \lambda, \ldots, \lambda)$, this
table gives in order the triple (a) the probability of selecting a best
population, (b) the probability of selecting any non-best population
and (c) the expected proportion of the selected populations
([(a)+(k-1)(b)]/k).


k = 3,  $\delta$ = 0.5


 λ \ c_3       1.50    1.75    2.00    2.50    3.00    3.50    4.00    4.50    5.00

 0.50   (a)   .9894   .9901   .9991   .9992   .9999   .9999   .9999   .9999   .9999
        (b)   .9520   .9565   .9925   .9931   .9991   .9991   .9999   .9999   .9999
        (c)   .9644   .9677   .9947   .9951   .9994   .9994   .9999   .9999   .9999

 0.75   (a)   .9852   .9873   .9982   .9984   .9998   .9998   .9999   .9999   .9999
        (b)   .9263   .9394   .9838   .9864   .9975   .9976   .9996   .9996   .9999
        (c)   .9463   .9554   .9886   .9904   .9983   .9983   .9997   .9997   .9999

 1.00   (a)   .9830   .9868   .9975   .9980   .9997   .9997   .9999   .9999   .9999
        (b)   .9068   .9294   .9743   .9806   .9955   .9957   .9991   .9991   .9998
        (c)   .9322   .9485   .9820   .9864   .9969   .9970   .9994   .9994   .9999

 1.50   (a)   .9818   .9883   .9964   .9977   .9996   .9996   .9999   .9999   .9999
        (b)   .8750   .9160   .9554   .9740   .9917   .9928   .9979   .9980   .9995
        (c)   .9106   .9401   .9691   .9819   .9943   .9950   .9986   .9986   .9996

 2.00   (a)   .9819   .9901   .9958   .9981   .9995   .9996   .9999   .9999   .9999
        (b)   .8472   .9019   .9377   .9711   .9891   .9919   .9971   .9973   .9991
        (c)   .8921   .9313   .9571   .9801   .9926   .9944   .9980   .9981   .9994

 2.50   (a)   .9825   .9914   .9956   .9985   .9996   .9997   .9999   .9999   .9999
        (b)   .8235   .8884   .9274   .9687   .9873   .9920   .9968   .9972   .9989
        (c)   .8765   .9227   .9468   .9787   .9914   .9946   .9978   .9981   .9993

Table  VC 
Using the rule $R_3$ and under the configuration $(\delta\lambda, \lambda, \ldots, \lambda)$, this
table gives in order the triple (a) the probability of selecting a best
population, (b) the probability of selecting any non-best population and
(c) the expected proportion of the selected populations
([(a)+(k-1)(b)]/k).

k = 5,  $\delta$ = 0.3


1.50 

1 .75 

2.00 

2.50 

3.00 

3.50 

4.00 

4.50 

5.00 

0.50 

.9982 

• 9983 

.9996 

.9999 

.9999 

.9999 

.9999 

.9999 

1 .0000 

.9721 

.9739 

.9886 

.9961 

.9995 

.9996 

.9999 

.9999 

.9999 

.9773 

.9788 

.9908 

.9969 

.9996 

.9997 

.9999 

.9999 

.9999 

0.75 

.9979 

.9932 

.9992 

.9998 

.9999 

.9999 

.9999 

.9999 

.9999 

.9562 

.9635 

.9759 

.9929 

.9988 

.9993 

.9998 

.9999 

.9999 

.9646 

.9704 

• 9805 

.9943 

.9990 

.9994 

.9998 

.9999 

.9999 

1 .00 

.9977 

.9983 

.9990 

.9998 

.9999 

.9999 

.9999 

.9999 

.9999 

• 9394 

.9540 

.9655 

.9899 

.9979 

.9990 

.9996 

.9998 

.9999 

.9510 

.9629 

.9722 

.9919 

.9983 

.9992 

.9997 

.9998 

.9999 

1 .50 

.9976 

.9986 

.9991 

.9998 

.9999 

.9999 

.9999 

.9999 

.9999 

.9097 

•9359 

.9561 

.9861 

.9962 

.9987 

.9991 

.9997 

.9999 

.9272 

.9485 

.9647 

.9889 

.9970 

.9990 

.9993 

.9997 

.9999 

2.00 

.9979 

.9989 

.9994 

.9999 

.9999 

.9999 

.9999 

.9999 

.9999 

.8871 

.9218 

.9526 

.9849 

.9953 

.9985 

.9989 

.9996 

.9999 

.9092 

.9372 

.9620 

.9879 

.9962 

.9988 

.9991 

.9997 

.9999 

2.50 

.9983 

.9992 

.9996 

.9999 

.9999 

.9999 

.9999J 

.9999 

.9999 

i 

.8704| 

.9150! 

.9505 

.98451 

.9950 

.9983 

.9989 

.9996 

.9999 

3.00  .9987 

.8597 
.8875 


3.50  .9990 


.8539 
.8829 
4.00  .9993 


.8805 

5.00  .9997 
.8462 
.8769 

6.00  .9998 
.8437 
.8750 

8.00  .9999 
.8446 
.8757 


10.00  .9999 


.9999  .9999  .9999 

.9953  .9983  .9991  I .9997 


.9599 
• 9998 


.9604 

.9999 

.9519 

.9615 


.9555 

.9644 


.9962  .9986 

.9999  .9999 


.9999 


.9957  .9984  .9993  .9998 

.9966  .9987  .9994  .9998 

.9999  .9999  .9999  .999911.0000 

.9963  .9987  .9994  .9998 

.9970  .9989  .9995  .9999 

.9999  .9999  .9999  1.000011.0000 


.9973  .9992  .9997 

.9979  .9993  .9997 


.9999 


.9677 


.9999  1.0000  1.0000  1.000011.0000 

.9981  .9995  .9998  .9999 

.9985  .9996  .9998  .9999 


.9999  .9999  1 .0000  1 .0000  1 .0000  1 .00001 1 .0000 


.9676 


.9991  .9998 


.9999 


.9993  .9998  .5999  .9999 


15.00  .999911 

.8624 


.9999  1.0000  1.0000  1.0000  1.0000  I .0000! 1 .0000 

.9744  .9967  .9996  .9999  .9999 


.9999  .9999  .9999 


0000! 1 .0000  1 .0000  1.0000  1 .0000 


.9999  .9999  .9999  .9999 

.9999  .9999  .9999 
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Table  VD 

Using the rule $R_3$ and under the configuration $(\delta\lambda, \lambda, \ldots, \lambda)$, this
table gives in order the triple (a) the probability of selecting a best
population, (b) the probability of selecting any non-best population and
(c) the expected proportion of the selected populations
([(a)+(k-1)(b)]/k).


k = 5,  $\delta$ = 0.5


\c. 

1.50 

1.75 

2.00 

2.50 

3.00 

3.50 

4.00 

4.50 

5.00 

0.50 

.9948 

.9952 

.9985 

.9996 

.9999 

• 9999 

.9999 

.9999 

.9999 

.9737 

.9756 

.9890 

• 9964 

• 9995 

.9996 

• 9999 

.9999 

.9999 

.9779 

.9795 

.9909 

• 9971 

.9996 

.9997 

.9999 

.9999 

.9999 

0.75 

.9930 

.9944 

.9970 

• 9994 

.9999 

.9999 

.9999 

.9999 

• 9999 

.9539 

.9663 

.9773 

.9935 

.9989 

.9994 

.9998 

.9999 

.9999 

.9657 

.9720 

.9813 

.9947 

.9991 

.9995 

.9998 

.9999 

.9999 

1 .00 

• 9915 

.9941 

.9961 

.9993 

.9999 

..9999 

.9999 

.9999 

.9999 

.9^35 

.9580 

.9686 

.9910 

.9982 

.9992 

• 9996 

• 9998 

.9999 

.9531 

.9652 

.9741 

.9927 

.9985 

.9993 

.9997 

.9998 

.9999 

1 .50 

.9898 

.9940 

• 9963 

.9993 

.9999 

.9999 

.9999 

.9999 

.9999 

.9175 

.9424 

.9617 

.9883 

.9968 

.9990 

.9993 

.9997 

• 9999 

.9320 

.9527 

.9687 

.9905 

.9974 

.9992 

• 9994 

.9998 

.9999 

2.00 

.9898 

.9943 

.9972 

.9995 

.9999 

.9999 

.9999 

.9999 

.9999 

.8985 

.9313 

.9596 

.9876 

.9962 

.9988 

.9991 

.9997 

.9999 

.9168 

.9439 

.9672 

.9900 

.9970 

.9990 

.9993 

.9998 

.9999 

2.50 

.9905 

.9952 

.9979 

.9996 

.9999 

.9999 

.9999 

.9999 

.9999 

.8851* 

.9274 

.9587 

.9876 

.9962 

.9987 

.9992 

.9997 

.9999 

.9061* 

.9410 

.9665 

• 9969 

.9990 

.9994 

.9998 

.9999 

75 


3.00 

.9917 

.9963 

.9985 

.9997 

.9999 

.9999 

• 9999 

.8781 

.9284 

.9592 

.9881 

.9965 

.9988 

.9994 

.9009 

.9420 

.9671 

.9904 

.9972 

.9990 

• 9995 

3.50 

.9931 

.9973 

.9989 

.9998 

.9999 

.9999 

.9999 

.8752 

.9307 

.9606 

.9891 

.9970 

.9989 

.9995 

.8988 

.9440 

.9683 

.9912 

.9976 

.9991 

.9996 

*♦.00 

• 9945 

.9980 

.9992 

.9999 

.9999 

.9999 

• 9999 

.8740 

.9326 

• 9625 

.9903 

.9975 

.9991 

.9996 

.8981 

• 9457 

.9698 

.9922 

.9980 

.9993 

.9997 

5.00 

.9964 

.9989 

.9996 

.9999 

.9999 

.9999 

-9999 

.8726 

.9350 

.9666 

.9927 

.9983 

.9995 

.9998 

.8974 

.9478 

.9732 

.9941 

.9986 

.9996 

.9998 

6.00 

.9976 

.9993 

.9998 

.9999 

9999 

.9999 

.9999 

r*~\ 

cc 

• 

.9382 

.9708 

.9944 

j .9989 

.9997 

.9999 

.8982 

.9504 

.9766 

.9955 

.9991 

• 9998 

• 9999 

8.00 

.9990 

• 9998 

.9999 

.9999 

.9999 

.9999 

1.0000 

.8787 

.9470 

.9783 

i .9969 

.9995 

.9999 

.9999 

.9028 

.9576 

.9826 

.9975 

.9996 

.9999 

.9999 

10.00 

.9996 

• 9999 

.9999 

• 9999 

.9999 

1 .0000 

1 .0000 

.8858 

.9560 

.9840 

.9983 

.9998 

.9999 

.9999 

.9086 

.9648 

.9872 

.9986 

.9998 

.9999 

.9999 

15.00 

.9999 

.9999 

.9999 

1 .0000 

1 .0000 

1 .0000 

1 .0000 

.9046 

.9720 

.9927 

.9996 

.9999 

.9999 

.9999 

.9237 

.9776 

. 994 1 

.9997 

.9999 

.9999 

.9999 

.9999 

.9998 

.9995 

.9999 

• 9998 
.9999 
.9999 

• 9999 
.9999 

• 9999 
.9999 
.9999 
.0000 
.9999 
.9999 
.0000 
.9999 
.9999 
.0000 
.9999 
.9999 
.0000 
.0000 
.0000 


1 .0000 
] .0000 
I .0000 
CHAPTER II

SOME RESULTS ON SUBSET SELECTION
PROCEDURES FOR DOUBLE EXPONENTIAL POPULATIONS

2.1  Introduction

In  this  chapter  we  study  the  selection  problems  and  some  other 
related  statistical  inference  problems  for  the  k double  exponential 
(Laplace)  populations.  Before  we  do  this,  we  give  some  discussion  of 
the  Laplace  distribution,  its  characteristics  (vs.  normal,  logistic  and 
Cauchy)  and  its  use  as  a model  in  statistics  and  probability. 

The  double  exponential  distribution  arises  as  a model  in  some 
statistical  problems  as  explained  later.  This  distribution  is  also 
considered  in  robustness  studies,  which  suggests  that  it  provides  a 
model with different characteristics than some of the other commonly
used models such as the normal distribution.  In particular, the tails
of the double exponential distribution are thicker than the tails of
the normal or logistic, but not as thick as the Cauchy (see p. 43,
Hajek [47]).  Yet the double exponential has not been used very
extensively as a model.  This could be due in part to the lack of available
statistical techniques for this distribution, although it is likely that
the experimenter has shied away from using the double exponential
because it has a sharp peak in the center.  However, many applications
would be primarily concerned with tail probabilities, and it would seem
that the double exponential would be a useful model if exponential tails
are required.

The  double  exponential  has  some  application  as  a model  in  the  area 
of  Actuarial  Science,  and  it  has  been  suggested  as  a model  for  the 
distribution  of  the  strength  of  flaws  in  materials  by  Epstein  [27]. 

Using the weakest link principle, the strength of the material should
decrease as the number of flaws or volume increases.  In particular,
from extreme-value theory the double exponential assumption leads to
the result that the mode or most probable strength decreases in
proportion to log n, where n represents the size or number of flaws of the
material.  In comparison, the assumption of a normal model leads to a
decrease in proportion to $(\log n)^{1/2}$.  For most applications to material
strength, only the minimum flaw strength would ordinarily be observable;
however, Epstein [27] suggests that there may be many other types of
problems, such as a system of components in series, which might be
similar from a statistical point of view.  Other possible applications
of the double exponential are suggested by the fact that the difference
of two independent (not necessarily identical) two-parameter exponential
variables follows the double exponential distribution, and that the
logarithm of the ratios of uniform or Pareto variables follows the
double exponential distribution.

In  classical  theory,  once  having  assumed  the  form  of  the  parent 
distribution,  we  can  derive  a criterion  which  is  appropriate  to  this 
assumption.  For  example,  under  the  assumption  of  normality,  for  the 
comparison  of  two  means  we  would  derive  the  t-statistic.  It  is  then 
customary  to  justify  the  use  of  such  a normal  theory  criterion  in  the 
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practical  circumstance  in  which  normality  cannot  be  guaranteed  by 
arguing  that  the  distribution  of  the  characteristic  is  but  little 
affected  by  non-normality  of  the  parent  distribution  - that  is,  it  is 
robust  under  non-normality.  However,  this  argument  ignores  the  fact 
that  if  the  parent  distribution  really  differed  from  the  normal,  the 
appropriate criterion would no longer be the normal-theory statistic.

Box and Tiao [19] reconsidered the analysis of Darwin's paired data
on the heights of self- and cross-fertilized plants quoted by Fisher
in "The Design of Experiments" (1935).  In this development the parent
distribution  is  not  assumed  to  be  normal,  but  only  a member  of  the 
following  class  of  symmetric  distributions 

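\[
p(y \mid \theta, \sigma, \beta) = \omega\, \exp\Bigl\{ -\tfrac{1}{2} \Bigl| \frac{y - \theta}{\sigma} \Bigr|^{2/(1+\beta)} \Bigr\},
\qquad \omega^{-1} = \Gamma\bigl[1 + \tfrac{1}{2}(1+\beta)\bigr]\, 2^{\,1 + \frac{1}{2}(1+\beta)}\, \sigma, \tag{2.1.1}
\]

where $-\infty < y < \infty$, $0 < \sigma < \infty$, $-\infty < \theta < \infty$, $-1 < \beta \le 1$.  This class of
distributions includes the normal ($\beta = 0$) and the double exponential ($\beta = 1$),
and its kurtosis parameter is $\beta$.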
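The two special cases of this family can be confirmed numerically. The sketch below is illustrative and assumes the normalizing constant of the family is Γ[1 + (1+β)/2] · 2^{1+(1+β)/2} · σ; the function names and evaluation points are hypothetical. Under that assumption, β = 0 reproduces the normal density with standard deviation σ, and β = 1 reproduces a double exponential density whose scale parameter is 2σ.

```python
from math import exp, gamma, pi, sqrt


def box_tiao_pdf(y, theta=0.0, sigma=1.0, beta=0.0):
    """Density of the symmetric family, with the assumed normalizing constant
    Gamma[1 + (1+beta)/2] * 2**(1 + (1+beta)/2) * sigma."""
    a = 1.0 + 0.5 * (1.0 + beta)
    omega = 1.0 / (gamma(a) * 2.0**a * sigma)
    return omega * exp(-0.5 * abs((y - theta) / sigma) ** (2.0 / (1.0 + beta)))


def normal_pdf(y, theta, sigma):
    """Normal density with mean theta and standard deviation sigma."""
    return exp(-0.5 * ((y - theta) / sigma) ** 2) / (sigma * sqrt(2.0 * pi))


def double_exponential_pdf(y, theta, scale):
    """Double exponential density exp(-|y - theta| / scale) / (2 * scale)."""
    return exp(-abs(y - theta) / scale) / (2.0 * scale)


gap_normal = abs(box_tiao_pdf(0.7, 0.2, 1.3, beta=0.0) - normal_pdf(0.7, 0.2, 1.3))
gap_laplace = abs(box_tiao_pdf(0.7, 0.2, 1.3, beta=1.0) - double_exponential_pdf(0.7, 0.2, 2.6))
```

Both gaps should vanish up to rounding, which also makes explicit that the σ of this family and the scale of the double exponential density differ by a factor of 2 when β = 1.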

If the probability density function of the double exponential is
given by

\[
f(x; \theta, \sigma) = \frac{1}{2\sigma}\, e^{-|x - \theta| / \sigma}, \qquad -\infty < x < \infty,\ -\infty < \theta < \infty,\ \sigma > 0, \tag{2.1.2}
\]

then the mode of the distribution is $x = \theta$, where it has a sharp peak.
The expected value and standard deviation of (2.1.2) are $\theta$ and $\sqrt{2}\,\sigma$
respectively.  Moments of the standardized double exponential order
statistics  can  be  obtained  by  using  the  closed-form  expressions  for  the 
moments of the standardized negative exponential order statistics
derived by Epstein and Sobel [28].  Govindarajulu [34] has given
the expressions for these moments.

Chew [23] gives the graphs of the standardized density functions
of normal, logistic and double exponential distributions, from which
it is clear that the tails of the double exponential distribution are
thicker than those of the normal or logistic, in the sense that the
curve of the double exponential is above that of the others to the left
and right of some points.  In the case of the normal distribution this
point is 2.64.


If the cumulative distribution functions

\[
G_1(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\, du
\qquad \text{and} \qquad
G_2(x) = \begin{cases} \tfrac{1}{2}\, e^{\sqrt{2}\,x}, & x < 0 \\[2pt] 1 - \tfrac{1}{2}\, e^{-\sqrt{2}\,x}, & x \ge 0 \end{cases}
\]

of the standardized normal and double exponential distributions are
compared (and similarly the standardized logistic
$G_3(x) = 1/(1 + e^{-\pi x/\sqrt{3}})$ and the double exponential
distribution), the differences $G_2(x) - G_1(x)$ (as well as $G_2(x) - G_3(x)$)
vary in the way shown in the graph below.  Since $G_1(x)$, $G_2(x)$ and
$G_3(x)$ are symmetric about x = 0, only the values for $x \ge 0$ are shown.


With regard to point estimation, it is well known that the maximum
likelihood estimates based on the complete sample of size n are given
by $\hat\theta = \tilde X$ and $\hat\sigma = \frac{1}{n} \sum_{i=1}^{n} |X_i - \tilde X|$, where $\tilde X$ denotes the sample median.
Also best linear estimators (based on order statistics) under symmetric
censoring are given by Govindarajulu [35] for sample sizes up to 20,
and some alternate estimates are suggested by Raghunandanan and
Srinivasan [66].  Interval estimation for the parameters of the
two-parameter double exponential distribution is considered by Bain
and Engelhardt [ k ].

Now we discuss the problem of comparison of k ($\ge 2$) double
exponential distributions.  First we study the selection problem for
the largest mean (location).

2.2  Selecting  a Subset  Containing  the  Best  of  Several  Double 
Exponential  Populations  with  Respect  to  the  Location  Parameter 

(A)  Formulation  of  the  Problem 

Let $X_i$, i = 1,2,...,k, be k independent random variables from the
double exponential populations $\pi_i$, i = 1,2,...,k respectively, with
probability density function

\[
f(x; \theta_i, \sigma) = \frac{1}{2\sigma} \exp[-|x - \theta_i| / \sigma], \qquad -\infty < x < \infty,\ -\infty < \theta_i < \infty,\ \sigma > 0,
\]

where $\sigma$ is a common, known constant for each of the k populations.  We
may, without loss of generality, assume $\sigma$ to be one.  The ranked
parameters are denoted by $\theta_{[1]} \le \theta_{[2]} \le \cdots \le \theta_{[k]}$.  As before, it is
assumed that there is no a priori information available about the
correct pairing of the ordered $\theta_{[i]}$ and the k given populations from
which observations are taken.  Any population whose parameter value
equals $\theta_{[k]}$ will be defined as a best population.  A correct selection
(CS) is defined as the selection of any subset of the k given
populations which contains at least one best population.

Suppose we take (2n+1) independent observations from $\pi_i$,
i = 1,2,...,k; the sample size (2n+1) is assumed to be given in the
primary problem below.  Let P* (1/k < P* < 1) be a preassigned constant.
Let $P(CS; k, n, \theta, R)$ denote the probability of a correct selection
when the procedure R is used with the given k, n and when the true
configuration of parameter values is $\theta = (\theta_1, \ldots, \theta_k)$; let the space
of all possible values of $\theta$ be denoted by $\Omega$.

The problem of primary interest is to define a procedure R which
selects a subset of the k given populations that is small, never
empty, and large enough so that it contains the best population with
probability at least P*, regardless of the true configuration $\theta$ in
$\Omega$, i.e., so that

\[
\inf_{\Omega} P(CS; k, n, \theta, R) \ge P^* . \tag{2.2.1}
\]

After having defined a particular procedure R = R(k, n, P*) for each
possible set of values of k, n and P*, we discuss the expected size
$E\{S; k, n, \theta, P^*, R\}$ of the selected subset when the procedure R is
used with the given k, n, P* and where $\theta$ is the true parameter
configuration in $\Omega$.

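Let $Y_i$ denote the sample median of the (2n+1) observations
$X_{i1}, \ldots, X_{i,2n+1}$ from the i-th population, and let $Y_{(i)}$ denote that
unknown variable which is associated with $\theta_{[i]}$.  The probability density
$g_n(\cdot)$ and the cumulative distribution $G_n(\cdot)$ of $Y_i$ are given by

\[
g_n(y; \theta_i) = \frac{(2n+1)!}{n!\, n!} \Bigl(\tfrac{1}{2}\, e^{-|y - \theta_i|}\Bigr)^{n+1} \Bigl(1 - \tfrac{1}{2}\, e^{-|y - \theta_i|}\Bigr)^{n}, \tag{2.2.2}
\]

\[
G_n(y; \theta_i) = \begin{cases}
1 - \displaystyle\sum_{j=0}^{n} \binom{2n+1}{j} \Bigl(\tfrac{1}{2}\, e^{\,y - \theta_i}\Bigr)^{j} \Bigl(1 - \tfrac{1}{2}\, e^{\,y - \theta_i}\Bigr)^{2n+1-j}, & y < \theta_i \\[10pt]
\displaystyle\sum_{j=0}^{n} \binom{2n+1}{j} \Bigl(\tfrac{1}{2}\, e^{-(y - \theta_i)}\Bigr)^{j} \Bigl(1 - \tfrac{1}{2}\, e^{-(y - \theta_i)}\Bigr)^{2n+1-j}, & y \ge \theta_i .
\end{cases} \tag{2.2.3}
\]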
Now, we propose the selection procedure $R_5$ as follows:

$R_5$: Retain in the selected subset only those populations $\pi_i$
for which

\[
Y_i \ge \max_{1 \le j \le k} Y_j - d \tag{2.2.4}
\]

where d = d(k, n, P*) is the smallest non-negative constant to be
determined that will satisfy the basic probability requirement (2.2.1)
for all configurations $\theta = (\theta_1, \theta_2, \ldots, \theta_k)$.
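The selection rule just described can be sketched directly. This is a minimal illustration, not the author's code: the function name and the sample data are hypothetical, the constant d is taken as given, and populations are indexed from 0.

```python
from statistics import median


def select_subset_R5(samples, d):
    """Return the (0-based) indices of the populations whose sample median
    is within d of the largest sample median."""
    medians = [median(s) for s in samples]
    best = max(medians)
    return [i for i, y in enumerate(medians) if y >= best - d]


# Illustrative data: k = 3 populations, 2n + 1 = 3 observations each.
samples = [[0.1, -0.3, 0.4], [1.2, 0.9, 2.0], [0.8, 1.1, 0.7]]
chosen = select_subset_R5(samples, d=0.5)
```

The subset is never empty, since the population with the largest median always satisfies the inequality, and it grows as d increases.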


(B) Probability of a Correct Selection and Its Infimum

The following result concerning the rule $R_5$ can be proved.

Theorem 2.2.1.

$$\inf_{\underline{\theta}\in\Omega} P_{\underline{\theta}}(CS \mid R_5) = \inf_{\underline{\theta}\in\Omega_0} P_{\underline{\theta}}(CS \mid R_5) = \int_{-\infty}^{\infty} G_n^{k-1}(y+d)\, g_n(y)\, dy$$

where $\Omega_0 = \{\underline{\theta} = (\theta_1,\ldots,\theta_k) : \theta_1 = \theta_2 = \cdots = \theta_k = \theta\}$, and $G_n(y)$, $g_n(y)$ are the cdf and pdf of the sample median of $(2n+1)$ observations from the standard double exponential distribution.

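Proof. For $\underline{\theta} \in \Omega$,

$$\begin{aligned} P_{\underline{\theta}}(CS \mid R_5) &= P_{\underline{\theta}}\{Y_{(k)} \ge Y_{(j)} - d,\ j = 1,2,\ldots,k-1\} \\ &= P_{\underline{\theta}}\{Y_{(k)} - \theta_{[k]} \ge Y_{(j)} - \theta_{[j]} + \theta_{[j]} - \theta_{[k]} - d,\ j = 1,2,\ldots,k-1\} \\ &= \int_{-\infty}^{\infty} \Big\{ \prod_{j=1}^{k-1} \int_{-\infty}^{\,y+\theta_{[k]}-\theta_{[j]}+d} g_n(z)\, dz \Big\}\, g_n(y)\, dy. \end{aligned} \tag{2.2.5}$$

Note that $\theta_{[k]} - \theta_{[j]} \ge 0$ for $j = 1,\ldots,k-1$; thus the result follows.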
Hence, if we choose $d$ to be the smallest constant to satisfy

$$\int_{-\infty}^{\infty} G_n^{k-1}(y+d)\, g_n(y)\, dy = P^* \tag{2.2.6}$$

then we have determined the constant $d$ for which

$$\inf_{\underline{\theta}\in\Omega} P_{\underline{\theta}}(CS \mid R_5) = P^*. \tag{2.2.7}$$
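Equation (2.2.6) has no closed-form solution in general, but $d$ can be found numerically because the left-hand side increases with $d$. The sketch below is one possible way to do this, assuming composite Simpson integration on a truncated range and bisection; it is not the quadrature used for the tables in this chapter, and all function names are hypothetical.

```python
import math

def median_pdf(y, n):
    """Density g_n(y) of the median of 2n+1 standard double exponential draws."""
    u = 0.5 * math.exp(-abs(y))
    return math.factorial(2 * n + 1) / math.factorial(n) ** 2 * u ** (n + 1) * (1 - u) ** n

def median_cdf(y, n):
    """Distribution function G_n(y), written as a binomial sum."""
    u = 0.5 * math.exp(-abs(y))
    s = sum(math.comb(2 * n + 1, j) * u ** j * (1 - u) ** (2 * n + 1 - j)
            for j in range(n + 1))
    return 1.0 - s if y < 0 else s

def pcs_infimum(d, k, n, half_width=25.0, steps=4000):
    """Composite Simpson approximation to the integral in (2.2.6)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * median_cdf(y + d, n) ** (k - 1) * median_pdf(y, n)
    return total * h / 3.0

def solve_d(k, n, p_star, tol=1e-8):
    """Bisection for the smallest d >= 0 with pcs_infimum(d) >= p_star."""
    if pcs_infimum(0.0, k, n) >= p_star:
        return 0.0
    lo, hi = 0.0, 1.0
    while pcs_infimum(hi, k, n) < p_star:      # expand until the root is bracketed
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_infimum(mid, k, n) < p_star:
            lo = mid
        else:
            hi = mid
    return hi

def select_subset(medians, d):
    """Rule R_5: keep population i (1-based) when Y_i >= max_j Y_j - d."""
    top = max(medians)
    return [i for i, y in enumerate(medians, start=1) if y >= top - d]
```

For example, `select_subset(medians, solve_d(k, n, p_star))` applies the rule to a list of $k$ observed sample medians. At $d = 0$ the integral equals $1/k$, which is a convenient check on the integration.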


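(C) Some Properties of $R_5$

For $\underline{\theta} \in \Omega$ and $\underline{\theta} = (\theta_{[1]},\ldots,\theta_{[k]})$, define $p_{\underline{\theta}}(i) = P_{\underline{\theta}}\{R \text{ selects population } \pi_{(i)}\}$ and recall the following definitions (see Santner [69]).

Definition 2.2.1. The rule $R$ is strongly monotone in $\pi_{(i)}$ means $p_{\underline{\theta}}(i)$ is non-decreasing in $\theta_{[i]}$ when all other components of $\underline{\theta}$ are fixed, and non-increasing in $\theta_{[j]}$ $(j \ne i)$ when all other components of $\underline{\theta}$ are fixed.

Definition 2.2.2. $R$ is a monotone procedure means for every $\underline{\theta} \in \Omega$ and $1 \le i < j \le k$, $p_{\underline{\theta}}(i) \le p_{\underline{\theta}}(j)$.

Definition 2.2.3. $R$ is an unbiased procedure means for every $\underline{\theta} \in \Omega$ and $1 \le j < k$,

$$P_{\underline{\theta}}\{R \text{ does not select } \pi_{(k)}\} \le P_{\underline{\theta}}\{R \text{ does not select } \pi_{(j)}\}.$$

Of course, if $R$ is monotone it is also unbiased.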

Theorem 2.2.2. For any $i = 1,2,\ldots,k$, the procedure $R_5$ is strongly monotone in $\pi_{(i)}$.

Proof. The proof follows easily from the expression

$$p_{\underline{\theta}}(i) = \int_{-\infty}^{\infty} \Big\{ \prod_{\substack{j=1 \\ j\ne i}}^{k} G_n(y + \theta_{[i]} - \theta_{[j]} + d) \Big\}\, g_n(y)\, dy.$$

Corollary 2.2.1. The rule $R_5$ is monotone and unbiased.

Proof. It is known and easy to see that if $R$ is strongly monotone in $\pi_{(i)}$ for all $i = 1,2,\ldots,k$, then it is monotone.

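Now we consider some special configurations of $\underline{\theta} \in \Omega$:

$$\theta_{[i]} = \theta,\quad i = 1,2,\ldots,k-1; \qquad \theta_{[k]} = \theta + \Delta,\quad \Delta > 0, \tag{2.2.8}$$

$$\theta_{[i]} = \theta + (i-1)\Delta,\quad \Delta > 0,\quad i = 1,2,\ldots,k. \tag{2.2.9}$$

Under (2.2.8),

$$p_{\underline{\theta}}(i) = \int_{-\infty}^{\infty} [G_n(y+d)]^{k-2}\, G_n(y+d-\Delta)\, g_n(y)\, dy \quad \text{for } i = 1,2,\ldots,k-1, \tag{2.2.10}$$

$$p_{\underline{\theta}}(k) = \int_{-\infty}^{\infty} [G_n(y+d+\Delta)]^{k-1}\, g_n(y)\, dy. \tag{2.2.11}$$

While under (2.2.9),

$$p_{\underline{\theta}}(i) = \int_{-\infty}^{\infty} \Big\{ \prod_{\substack{j=1 \\ j\ne i}}^{k} G_n(y+d+(i-j)\Delta) \Big\}\, g_n(y)\, dy, \quad i = 1,2,\ldots,k.$$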

From  the  above  equations  we  can  make  the  following  remarks: 

Remark 2.2.1. For fixed $P^*$, $k$, $n$, $i$ $(i = 1,2,\ldots,k-1)$, the probability of selecting population $\pi_{(i)}$ decreases from $P^*$ to zero as $\Delta$ increases from zero to infinity.

Remark 2.2.2. For fixed $P^*$, $k$ and $n$, the probability of selecting $\pi_{(k)}$ increases from $P^*$ to one as $\Delta$ increases from zero to infinity.

Remark 2.2.3. For fixed $P^*$, $k$, $i$ $(i = 1,\ldots,k-1)$ and $\Delta$, the probability of selecting population $\pi_{(i)}$ tends to zero as $n \to \infty$, while the probability of selecting $\pi_{(k)}$ tends to one as $n \to \infty$.

Conclusion: Under either configuration (2.2.8), (2.2.9),

$$E_{\underline{\theta}}(S \mid R_5) = \sum_{i=1}^{k} p_{\underline{\theta}}(i) \to 1 \text{ as } \Delta \to \infty \text{ for fixed } n, \quad\text{and}\quad E_{\underline{\theta}}(S \mid R_5) \to 1 \text{ as } n \to \infty \text{ for fixed } \Delta.$$

(D) Asymptotic Results for the Procedure $R_5$

It suffices to consider the parameter space $\Omega_0$. For $n$ large, we discuss an asymptotic property of the procedure as follows. Let $Y$ be the sample median from a sample of size $(2n+1)$ with pdf $f(x;\theta) = \tfrac12 e^{-|x-\theta|}$, $-\infty < x < \infty$. Then it is known (see Chu [24]) that under $\Omega_0$, $(Y-\theta)/\sigma_n$ is asymptotically normally distributed (here $\sigma_n = 1/\sqrt{2n+1}$). Let $Z$ denote a random variable which has a standard normal distribution; then $(Y-\theta)/\sigma_n$ is asymptotically distributed as $Z$. Hence, under $\Omega_0$, the probability that

$$Y_k \ge \max_{1\le j\le k} Y_j - d$$

is asymptotically the same as the probability that

$$Z_k \ge \max_{1\le j\le k} Z_j - \sqrt{2n+1}\, d \tag{2.2.13}$$

where $Z_i$, $i = 1,2,\ldots,k$, are iid standard normal variables. Hence,

$$\inf_{\underline{\theta}\in\Omega_0} P_{\underline{\theta}}(CS \mid R_5) \approx P\Big\{Z_k \ge \max_{1\le j\le k} Z_j - \sqrt{2n+1}\, d\Big\} = \int_{-\infty}^{\infty} \Phi^{k-1}\big(x + \sqrt{2n+1}\, d\big)\, d\Phi(x),$$

where $\Phi(\cdot)$ is the cdf of the standard normal distribution.
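The normal approximation gives a quick large-sample value for $d$: solve $\int \Phi^{k-1}(x+c)\,d\Phi(x) = P^*$ for $c$ and set $d \approx c/\sqrt{2n+1}$. The sketch below illustrates this under the assumption that Simpson integration on $[-8, 8]$ and bisection are accurate enough; the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def normal_pcs(c, k, half_width=8.0, steps=2000):
    """Simpson approximation to the integral of Phi^(k-1)(x + c) dPhi(x)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * STD.cdf(x + c) ** (k - 1) * STD.pdf(x)
    return total * h / 3.0

def approx_d(k, n, p_star, tol=1e-9):
    """Large-n approximation to d: solve for c, then rescale by sqrt(2n+1)."""
    lo, hi = 0.0, 1.0
    while normal_pcs(hi, k) < p_star:          # bracket the root
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if normal_pcs(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return hi / math.sqrt(2 * n + 1)
```

For $k = 2$ the integral reduces to $\Phi(c/\sqrt{2})$, so the result can be checked against $\sqrt{2}\,\Phi^{-1}(P^*)/\sqrt{2n+1}$.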

(E) The Monotone Likelihood Ratio Property of the Sample Median

Suppose $Y$ is the sample median of $(2n+1)$ observations from the population with double exponential density function $f(x;\theta) = \tfrac12 e^{-|x-\theta|}$. The pdf $g_n(y;\theta)$ and cdf $G_n(y;\theta)$ of $Y$ are given by equations (2.2.2) and (2.2.3).

After some algebraic computations, we see that $G_n(\theta;\theta) = \tfrac12$; also

it is easy to show that $g_n(y;\theta)$ is not differentiable at $y = \theta$, so the two sides of $\theta$ are treated separately below.

Let $g_n(y;\theta) = g_n(y-\theta)$. It is shown in Lehmann [53, p.330] that a necessary and sufficient condition for $g_n(y-\theta)$ to have monotone likelihood ratio in $y$ is that $-\log g_n$ is convex. Our main goal in this section is to prove that this condition holds. Now

$$g_n(y) = c_n \left(\tfrac12 e^{-|y|}\right)^{n+1}\left(1 - \tfrac12 e^{-|y|}\right)^{n}, \quad\text{where } c_n = \frac{(2n+1)!}{n!\,n!}.$$

So,

$$-\log g_n(y) = -\log c_n + (n+1)\log 2 + (n+1)|y| - n\log\left(1 - \tfrac12 e^{-|y|}\right).$$

Let

$$h(y) = (n+1)|y| - n\log\left(1 - \tfrac12 e^{-|y|}\right) = \begin{cases} h_1(y), & y < 0, \\ h_2(y), & y \ge 0, \end{cases}$$

which is a continuous function. For $y < 0$, $h(y) = h_1(y) = -(n+1)y - n\log\left(1 - \tfrac12 e^{y}\right)$, and we have

$$h_1'(y) = -(n+1) + n\,\frac{\tfrac12 e^{y}}{1-\tfrac12 e^{y}} < 0, \quad\text{since for } y < 0,\ \frac{\tfrac12 e^{y}}{1-\tfrac12 e^{y}} < 1,$$

and

$$h_1''(y) = \frac{n\,\tfrac12 e^{y}}{\left(1-\tfrac12 e^{y}\right)^2} > 0.$$

Hence, for $y < 0$, $h_1(y)$ is a decreasing, convex function. Similarly, for $y \ge 0$,

$$h(y) = h_2(y) = (n+1)y - n\log\left(1 - \tfrac12 e^{-y}\right),$$

$$h_2'(y) = n+1 - n\,\frac{\tfrac12 e^{-y}}{1-\tfrac12 e^{-y}} > 0, \quad\text{since for } y \ge 0,\ \frac{\tfrac12 e^{-y}}{1-\tfrac12 e^{-y}} \le 1,$$

$$h_2''(y) = \frac{n\,\tfrac12 e^{-y}}{\left(1-\tfrac12 e^{-y}\right)^2} > 0.$$

Hence, for $y \ge 0$, $h_2(y)$ is an increasing, convex function. Note that $h(y)$ is continuous at $y = 0$, decreasing and convex for $y < 0$, and increasing and convex for $y \ge 0$. Hence $h(y)$ is a convex function, which implies that $-\log g_n(y)$ is also a convex function.
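As a worked check of the gluing step at $y = 0$, the one-sided derivatives follow directly from the expressions for $h_1'$ and $h_2'$:

$$h_1'(0^-) = -(n+1) + n\,\frac{\tfrac12}{1-\tfrac12} = -1, \qquad h_2'(0^+) = (n+1) - n\,\frac{\tfrac12}{1-\tfrac12} = 1.$$

Since $h_1'(0^-) \le h_2'(0^+)$, the derivative of $h$ does not decrease across the origin, so the two convex pieces join into a convex function even though $h$ has a corner at $y = 0$.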

Theorem 2.2.3. $g_n(y;\theta)$ has monotone likelihood ratio in $y$.


(F)  Expected  Size  of  the  Selected  Subset 

The procedure $R_5$ satisfies the basic probability requirement (2.2.1) and is defined by (2.2.4). Consistent with the basic probability requirement, we would like the size of the selected subset to be small. Now $S$, the size of the selected subset, is a random variable which takes integer values $1, 2, \ldots, k$. Hence, one criterion of the efficiency of the procedure $R_5$ is the expected value of the size of the subset. Now, we derive an expression for $E(S \mid R_5)$, the expected size of the selected subset using procedure $R_5$.


$$\begin{aligned} E(S \mid R_5) &= \sum_{i=1}^{k} P\{\text{Selecting the population with parameter } \theta_{[i]}\} \\ &= \sum_{i=1}^{k} P\Big\{Y_{(i)} \ge \max_{1\le j\le k} Y_{(j)} - d\Big\} \\ &= \sum_{i=1}^{k} \int_{-\infty}^{\infty} \Big\{ \prod_{\substack{j=1 \\ j\ne i}}^{k} G_n(y + d + \theta_{[i]} - \theta_{[j]}) \Big\}\, g_n(y)\, dy. \end{aligned} \tag{2.2.16}$$

If we set the $m$ smallest parameters $\theta_{[i]}$ $(1 \le m < k)$ equal to a common value $\theta$ (say) and define

$$Q = E(S \mid \theta_{[1]} = \theta_{[2]} = \cdots = \theta_{[m]} = \theta) \tag{2.2.17}$$

then by an argument analogous to that in Gupta [41] one can prove the following theorem.

Theorem 2.2.4. For given $k$, $P^*$ $(\tfrac1k < P^* < 1)$, the expected size of the selected subset $E(S \mid \theta_{[1]} = \theta_{[2]} = \cdots = \theta_{[m]} = \theta,\ m < k)$ in using the procedure $R_5$ is strictly increasing in $\theta$.

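Corollary 2.2.2.

$$\sup_{\underline{\theta}\in\Omega} E_{\underline{\theta}}(S \mid R_5) = k \int_{-\infty}^{\infty} G_n^{k-1}(y+d)\, g_n(y)\, dy = k P^*.$$

Corollary 2.2.3. In the subset $\Omega(\delta) = \{\underline{\theta} : \theta_{[i]} \le \theta_{[k]} - \delta,\ i = 1,2,\ldots,k-1\}$, the function $E_{\underline{\theta}}(S \mid R_5)$ takes on its maximum value when $\theta_{[i]} = \theta_{[k]} - \delta$, $i = 1,2,\ldots,k-1$, and so

$$\sup_{\underline{\theta}\in\Omega(\delta)} E_{\underline{\theta}}(S \mid R_5) = \int_{-\infty}^{\infty} G_n^{k-1}(y+d+\delta)\, g_n(y)\, dy + (k-1)\int_{-\infty}^{\infty} G_n^{k-2}(y+d)\, G_n(y+d-\delta)\, g_n(y)\, dy.$$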
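The bound in Corollary 2.2.3 can be evaluated numerically for given $d$, $\delta$, $k$ and $n$. The sketch below is an illustration only, assuming composite Simpson integration on a truncated range; the helper names (`simpson`, `max_expected_size`) are hypothetical.

```python
import math

def median_pdf(y, n):
    """Density g_n(y) of the median of 2n+1 standard double exponential draws."""
    u = 0.5 * math.exp(-abs(y))
    return math.factorial(2 * n + 1) / math.factorial(n) ** 2 * u ** (n + 1) * (1 - u) ** n

def median_cdf(y, n):
    """Distribution function G_n(y), written as a binomial sum."""
    u = 0.5 * math.exp(-abs(y))
    s = sum(math.comb(2 * n + 1, j) * u ** j * (1 - u) ** (2 * n + 1 - j)
            for j in range(n + 1))
    return 1.0 - s if y < 0 else s

def simpson(f, half_width=25.0, steps=4000):
    """Composite Simpson rule for f on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * f(-half_width + i * h)
    return total * h / 3.0

def max_expected_size(d, delta, k, n):
    """Supremum of E(S | R_5) over Omega(delta), as in Corollary 2.2.3."""
    # Term for the best population, which sits delta above the others.
    best = simpson(lambda y: median_cdf(y + d + delta, n) ** (k - 1) * median_pdf(y, n))
    # Common term for each of the k-1 remaining populations.
    rest = simpson(lambda y: median_cdf(y + d, n) ** (k - 2)
                   * median_cdf(y + d - delta, n) * median_pdf(y, n))
    return best + (k - 1) * rest
```

At $\delta = 0$ the value reduces to $k$ times the integral in (2.2.6), and for large $\delta$ it approaches one, in line with the Conclusion in part (C).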


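(G) Minimax Property of the Rule $R_5$

Suppose that $y_1,\ldots,y_k$ are the sample medians from the $k$ populations $\pi_1,\ldots,\pi_k$, respectively, and with this set of observations, we select the $i$th population with probability $\varphi_i(y_1,\ldots,y_k)$. Then the selection rule $R$ is said to be invariant or symmetric if

$$\varphi_i(y_1,\ldots,y_i,\ldots,y_j,\ldots,y_k) = \varphi_j(y_1,\ldots,y_j,\ldots,y_i,\ldots,y_k)$$

for all $i$ and $j$, i.e. if $y_j$ is observed from $\pi_i$ and $y_i$ from $\pi_j$, then we select the $j$th population with the same probability $\varphi_i(y_1,\ldots,y_k)$.

Notice that the rule $R_5$: $Y_i \ge \max_{1\le j\le k} Y_j - d$ satisfies the equations

$$\inf_{\underline{\theta}\in\Omega} P_{\underline{\theta}}(CS \mid R_5) = \inf_{\underline{\theta}\in\Omega_0} P_{\underline{\theta}}(CS \mid R_5) = P_{\underline{\theta}_0}(CS \mid R_5) = P^* \tag{2.2.20}$$

and, for an invariant rule $R'$ and $\underline{\theta}_0 \in \Omega_0$,

$$E_{\underline{\theta}_0}(S \mid R') - E_{\underline{\theta}_0}(S \mid R_5) = k\,\big[P_{\underline{\theta}_0}(CS \mid R') - P_{\underline{\theta}_0}(CS \mid R_5)\big]. \tag{2.2.22}$$

If the rule $R'$ satisfies the basic $P^*$ condition, it follows from (2.2.20) that the right hand side of (2.2.22) is non-negative. Thus

$$E_{\underline{\theta}_0}(S \mid R') \ge E_{\underline{\theta}_0}(S \mid R_5) = \sup_{\underline{\theta}\in\Omega} E_{\underline{\theta}}(S \mid R_5),$$

so that

$$\sup_{\underline{\theta}\in\Omega} E_{\underline{\theta}}(S \mid R') \ge \sup_{\underline{\theta}\in\Omega} E_{\underline{\theta}}(S \mid R_5),$$

i.e. the rule $R_5$ is minimax among all invariant rules satisfying the $P^*$-condition.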


2.3 Selecting the Population with the Largest Location Parameter - Indifference Zone Approach


In this section, we would like to use the indifference zone approach of Bechhofer [11] to select one population which is guaranteed to be associated with the largest location parameter with a fixed probability $P^*$ whenever the unknown parameters lie outside some subset, or zone of indifference, of the entire parameter space. The goal is to define a sequence of rules $\{R_6(n)\}$, each of which selects a single population, and find the smallest $n$ so that

$$P_{\underline{\theta}}(CS \mid R_6(n)) \ge P^* \quad \forall\, \underline{\theta} \in \Omega(\delta^*) = \{\underline{\theta} : \theta_{[k]} - \theta_{[k-1]} \ge \delta^*\} \tag{2.3.1}$$

where $P^*$ and $\delta^*$ are preassigned numbers.

For the sake of clarity, we will use the notation $Y_{[k]n}$ to denote the largest of the sample medians, each based on $(2n+1)$ observations.

$R_6(n)$: Select the population corresponding to $Y_{[k]n}$.

Let $\Omega_0(\delta^*) = \{\underline{\theta} : \theta_{[1]} = \cdots = \theta_{[k-1]} = \theta_{[k]} - \delta^*\}$. Then we have the following theorem.

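Theorem 2.3.1.

$$\inf_{\underline{\theta}\in\Omega(\delta^*)} P_{\underline{\theta}}(CS \mid R_6(n)) = \inf_{\underline{\theta}\in\Omega_0(\delta^*)} P_{\underline{\theta}}(CS \mid R_6(n)).$$

Proof. For $\underline{\theta} \in \Omega(\delta^*)$,

$$\begin{aligned} P_{\underline{\theta}}(CS \mid R_6(n)) &= P_{\underline{\theta}}\Big\{\max_{1\le j\le k-1} Y_{(j)n} < Y_{(k)n}\Big\} \\ &= P_{\underline{\theta}}\{Y_{(j)n} < Y_{(k)n},\ j = 1,2,\ldots,k-1\} \\ &= P_{\underline{\theta}}\{Y_{(j)n} - \theta_{[j]} < Y_{(k)n} - \theta_{[k]} + \theta_{[k]} - \theta_{[j]},\ j = 1,2,\ldots,k-1\} \\ &= \int_{-\infty}^{\infty} \prod_{j=1}^{k-1} G_n(y + \delta_{kj})\, dG_n(y) \end{aligned} \tag{2.3.2}$$

where $G_n(y) = G_n(y; 0)$ is the cdf of the sample median of $(2n+1)$ independent observations from the standard double exponential distribution with density function $\tfrac12 e^{-|x|}$, $-\infty < x < \infty$, and $\delta_{kj} = \theta_{[k]} - \theta_{[j]}$. Hence the infimum of the probability of a correct selection occurs when

$$\theta_{[1]} = \theta_{[2]} = \cdots = \theta_{[k-1]} = \theta_{[k]} - \delta^*, \quad\text{provided } \theta_{[k]} - \theta_{[k-1]} \ge \delta^*.$$

This proves the theorem.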
The minimum sample size required to achieve the $P^*$ condition (2.3.1) is obtained from the smallest integer $n$ such that

$$\int_{-\infty}^{\infty} [G_n(y + \delta^*)]^{k-1}\, dG_n(y) \ge P^*. \tag{2.3.3}$$
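Condition (2.3.3) can be searched over $n$ directly. The sketch below is a minimal illustration, assuming composite Simpson integration and a simple upward scan from $n = 0$; the function names and the cap `n_max` are hypothetical choices, and each population then needs $2n+1$ observations.

```python
import math

def median_pdf(y, n):
    """Density g_n(y) of the median of 2n+1 standard double exponential draws."""
    u = 0.5 * math.exp(-abs(y))
    return math.factorial(2 * n + 1) / math.factorial(n) ** 2 * u ** (n + 1) * (1 - u) ** n

def median_cdf(y, n):
    """Distribution function G_n(y), written as a binomial sum."""
    u = 0.5 * math.exp(-abs(y))
    s = sum(math.comb(2 * n + 1, j) * u ** j * (1 - u) ** (2 * n + 1 - j)
            for j in range(n + 1))
    return 1.0 - s if y < 0 else s

def pcs_lfc(n, k, delta_star, half_width=25.0, steps=4000):
    """Simpson approximation to the left-hand side of (2.3.3)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * median_cdf(y + delta_star, n) ** (k - 1) * median_pdf(y, n)
    return total * h / 3.0

def min_sample_size(k, delta_star, p_star, n_max=200):
    """Smallest n (per-population sample size 2n+1) satisfying (2.3.3)."""
    for n in range(n_max + 1):
        if pcs_lfc(n, k, delta_star) >= p_star:
            return n
    raise ValueError("no n <= n_max meets the P* condition")
```

A call such as `min_sample_size(3, 0.5, 0.90)` returns the index $n$; the number of observations per population is then `2 * n + 1`.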


2.4 Selecting the t-Best Populations - Indifference Zone Approach

Now, we consider the problem of selecting the best $t$ populations, i.e., the populations with location parameters $\theta_{[k-t+1]}, \theta_{[k-t+2]}, \ldots, \theta_{[k]}$, without regard to order. We use the indifference zone approach based on the sample median $Y_i$ of $2n+1$ independent observations from population $\pi_i$, $i = 1,\ldots,k$. Define a sequence of procedures as follows:

$R_7(n)$: Select the $t$ populations associated with the $t$ largest values of $Y_i$.

Let $\Omega'(\delta^*) = \{\underline{\theta} : \theta_{[k-t+1]} - \theta_{[k-t]} \ge \delta^*\}$ and let

$$\Omega''(\delta^*) = \{\underline{\theta} : \theta_{[1]} = \cdots = \theta_{[k-t]} = \theta,\ \theta_{[k-t+1]} = \cdots = \theta_{[k]} = \theta + \delta^*\}.$$

Theorem 2.4.1.

$$\inf_{\underline{\theta}\in\Omega'(\delta^*)} P_{\underline{\theta}}\{CS \mid R_7(n)\} = \inf_{\underline{\theta}\in\Omega''(\delta^*)} P_{\underline{\theta}}\{CS \mid R_7(n)\}.$$

Proof. It was shown in Theorem 2.2.3 that the pdf $g_n(y;\theta)$ of the sample median has monotone likelihood ratio in $y$, which implies that it is stochastically increasing in $\theta$. Using a theorem of Barr and Rizvi [8], it follows that, for $\underline{\theta} \in \Omega'(\delta^*)$,

$$P_{\underline{\theta}}\{CS \mid R_7(n)\} = P_{\underline{\theta}}\Big\{\max_{1\le i\le k-t} Y_{(i)} < \min_{k-t+1\le j\le k} Y_{(j)}\Big\}$$

is a non-increasing function of $\theta_{[1]},\ldots,\theta_{[k-t]}$ and a non-decreasing function of $\theta_{[k-t+1]},\ldots,\theta_{[k]}$. Thus $P_{\underline{\theta}}\{CS \mid R_7(n)\}$ attains its infimum when $\theta_{[1]},\ldots,\theta_{[k-t]}$ attain their maximum possible values, while $\theta_{[k-t+1]},\ldots,\theta_{[k]}$ attain their minimum possible values subject to $\underline{\theta} \in \Omega'(\delta^*)$. The proof is thus completed.

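Using the same notation as in Section 2.2, let $G_n(y;\theta_i)$ denote the cdf of the sample median $Y_i$ with parameter $\theta_i$. Since $\theta_i$ is the location parameter, $G_n(y;\theta_i) = G_n(y-\theta_i; 0)$, which is stochastically increasing and continuous in both $y$ and $\theta_i$. For $\underline{\theta} \in \Omega'(\delta^*)$,

$$\begin{aligned} P_{\underline{\theta}}\{CS \mid R_7(n)\} &= P_{\underline{\theta}}\Big\{\max_{1\le i\le k-t} Y_{(i)} < \min_{k-t+1\le \ell\le k} Y_{(\ell)}\Big\} \\ &= P_{\underline{\theta}}\Big\{\bigcup_{j=k-t+1}^{k} \Big[ Y_{(j)} = \min_{k-t+1\le \ell\le k} Y_{(\ell)} \ \text{and}\ \max_{1\le i\le k-t} Y_{(i)} < Y_{(j)} \Big]\Big\} \\ &= \sum_{j=k-t+1}^{k} \int_{-\infty}^{\infty} \prod_{\beta=1}^{k-t} G_n(y;\theta_{[\beta]}) \prod_{\substack{\alpha=k-t+1 \\ \alpha\ne j}}^{k} \{1 - G_n(y;\theta_{[\alpha]})\}\, dG_n(y;\theta_{[j]}). \end{aligned}$$

In particular, for $\underline{\theta} \in \Omega''(\delta^*)$,

$$\begin{aligned} P_{\underline{\theta}}\{CS \mid R_7(n)\} &= t\int_{-\infty}^{\infty} G_n^{k-t}(y;\theta)\, \{1 - G_n(y;\theta+\delta^*)\}^{t-1}\, dG_n(y;\theta+\delta^*) \\ &= t\int_{-\infty}^{\infty} G_n^{k-t}(y-\theta;0)\, \{1 - G_n(y-\theta-\delta^*;0)\}^{t-1}\, dG_n(y-\theta-\delta^*;0) \\ &= t\int_{-\infty}^{\infty} G_n^{k-t}(y+\delta^*;0)\, \{1 - G_n(y;0)\}^{t-1}\, dG_n(y;0) \end{aligned}$$

which is independent of the parameter $\theta$. Hence for specified values of $\delta^*$ and $P^*$ $\big(1/\binom{k}{t} < P^* < 1\big)$, we can solve the equation

$$t\int_{-\infty}^{\infty} G_n^{k-t}(y+\delta^*;0)\, \{1 - G_n(y;0)\}^{t-1}\, dG_n(y;0) = P^*$$

for $n$.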


2.5 Subset Selection with Respect to the Scale Parameter $\sigma$

Let $X_i$, $i = 1,2,\ldots,k$, be $k$ independent random variables from double exponential populations $\pi_i$, $i = 1,2,\ldots,k$, respectively, with $\pi_i$ having the probability density function

$$f(x;\theta_i,\sigma_i) = \frac{1}{2\sigma_i}\exp\left[-|x-\theta_i|/\sigma_i\right], \quad -\infty < x < \infty,\ -\infty < \theta_i < \infty,\ \sigma_i > 0.$$

Take $n$ independent observations from $\pi_i$, $i = 1,2,\ldots,k$. From these data one wishes to select a subset that contains the population with the largest $\sigma_i$. Let $\sigma_{[1]} \le \cdots \le \sigma_{[k]}$ be the ordered parameters. We consider two different cases.


Case (i): $\theta_1, \theta_2, \ldots, \theta_k$ known.

In this case, the maximum likelihood estimator of $\sigma_i$ is

$$Y_i = \frac1n \sum_{j=1}^{n} |X_{ij} - \theta_i|,$$

which is distributed as a gamma variable with parameters $n$ and $\sigma_i/n$, i.e. $Y_i$ has density

$$\frac{(n/\sigma_i)^n}{\Gamma(n)}\, y^{n-1} e^{-ny/\sigma_i}, \quad y > 0.$$

Thus the problem reduces to the one considered by Gupta [40]. The selection procedure is

$R$: Select the population $\pi_i$ in the subset if and only if

$$Y_i \ge c \max_{1\le j\le k} Y_j.$$

Case (ii): $\theta_i$'s are unknown.

When $\theta_i$ is unknown, it is well known that the maximum likelihood estimate of $\sigma_i$ is given by

$$\hat\sigma_i = \frac1n \sum_{j=1}^{n} |X_{ij} - \tilde X_i|,$$

where $\tilde X_i$ denotes the sample median from population $\pi_i$. For this problem, we propose the following selection procedure.

$R_8$: Select the population $\pi_i$ in the subset if and only if

$$\hat\sigma_i \ge c_8 \max_{1\le j\le k} \hat\sigma_j$$

where $0 < c_8 < 1$ is so determined as to satisfy the basic probability requirement regardless of what the unknown $\sigma_i$'s may be.

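Let $V_i = n\hat\sigma_i/\sigma_i$, $i = 1,2,\ldots,k$. Then

$$P(CS \mid R_8) = P\Big\{\hat\sigma_{(k)} \ge c_8 \max_{1\le j\le k} \hat\sigma_{(j)}\Big\} = \int_0^{\infty} \prod_{j=1}^{k-1} F_{V_{(j)}}\Big(\frac{\sigma_{[k]}}{c_8\,\sigma_{[j]}}\, x\Big)\, dF_{V_{(k)}}(x).$$

So

$$\inf_{\underline{\sigma}\in\Omega'} P(CS \mid R_8) = \inf_{\underline{\sigma}\in\Omega'_0} P(CS \mid R_8) = \int_0^{\infty} F_V^{k-1}\Big(\frac{x}{c_8}\Big)\, dF_V(x),$$

where $\Omega' = \{\underline{\sigma} = (\sigma_1,\ldots,\sigma_k),\ \sigma_i > 0,\ i = 1,\ldots,k\}$, $\Omega'_0 = \{\underline{\sigma} = (\sigma,\ldots,\sigma),\ \sigma > 0\}$, and $F_V(\cdot)$, $F_{V_{(j)}}(\cdot)$, $j = 1,\ldots,k$, are the cdf's of $V = n\hat\sigma/\sigma$ and $V_{(j)} = n\hat\sigma_{(j)}/\sigma_{[j]}$, $j = 1,\ldots,k$, respectively.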


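Hence if the distribution $F_V(\cdot)$ is known, then the constant $c_8$ can be determined by the equation

$$\int_0^{\infty} F_V^{k-1}\Big(\frac{x}{c_8}\Big)\, dF_V(x) = P^*.$$

The exact distribution $F_V$ of $V$ is worked out for $n = 3$ by Bain and Engelhardt [4], who also give a chi-square approximation which is quite good even for small $n$. However, it follows from Chernoff, Gastwirth and Johns [22] that

$$\frac{1}{\sqrt n}(V - n) = \sqrt n\Big[\frac{\hat\sigma}{\sigma} - 1\Big]$$

is asymptotically a standard normal variable. When all $\sigma_i$ are identical,

$$\begin{aligned} P\{\hat\sigma_k \ge c_8\hat\sigma_j,\ j = 1,\ldots,k-1\} &= P\Big\{\sqrt n\Big(\frac{\hat\sigma_k}{\sigma} - 1\Big) \ge c_8\sqrt n\Big(\frac{\hat\sigma_j}{\sigma} - 1\Big) + \sqrt n\,(c_8 - 1),\ j = 1,\ldots,k-1\Big\} \\ &\approx \int_{-\infty}^{\infty} \Phi^{k-1}\Big(\frac{x - \sqrt n\,(c_8 - 1)}{c_8}\Big)\, d\Phi(x). \end{aligned}$$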

2.6  A Test  of  Homogeneity  Based  on  the  Sample  Median  Range 

Let $\pi_1, \pi_2, \ldots, \pi_k$ be $k$ independent double exponential populations such that the observations $X_{i1},\ldots,X_{i,2n+1}$ from $\pi_i$ have density $\tfrac12 e^{-|x-\theta_i|}$, for $i = 1,\ldots,k$. As before, let the sample median of these $(2n+1)$ observations be denoted as $Y_i$, $i = 1,\ldots,k$. In some practical situations one wishes to know whether the $\theta_i$ are significantly different or not. This problem is to test the homogeneity of the double exponential populations. We are interested in using a test based on the sample range of the $Y$'s and hence we wish to derive the distribution of the sample median range

$$R = \max_{1\le j\le k} Y_j - \min_{1\le j\le k} Y_j,$$

considering all $\theta_i$ to be equal to a common unknown $\theta$. When the value of $R$ is large, the hypothesis of homogeneity is rejected. We wish to find a constant $r$ such that $P(R > r) \le \alpha$ under the hypothesis $H_0$: $\theta_1 = \cdots = \theta_k = \theta$. This will provide an $\alpha$-level test.


Theorem 2.5.1. For $\alpha$, $0 < \alpha < 1$, let $r$ be a constant such that

$$P_{\Omega_0}\Big\{Y_k \ge \max_{1\le j\le k-1} Y_j - r\Big\} \ge 1 - \frac{\alpha}{k}.$$

Then $P_{\Omega_0}(R > r) \le \alpha$.


Proof. When $H_0$ is true, i.e., under $\Omega_0$,

$$\begin{aligned} P(R > r) &= P\Big\{\max_{1\le j\le k} Y_j - \min_{1\le j\le k} Y_j > r\Big\} \\ &\le k - \sum_{i=1}^{k} P\Big\{Y_i \ge \max_{1\le j\le k} Y_j - r\Big\} \\ &= k - k\,P\Big\{Y_k \ge \max_{1\le j\le k-1} Y_j - r\Big\} \\ &\le k - k\Big(1 - \frac{\alpha}{k}\Big) \\ &= \alpha. \end{aligned}$$

The above theorem establishes a connection between the selection rule $R_5$ and the above test for equality of the $\theta$'s.

2.7 On the Distribution of the Statistic Associated with $R_5$

Let $X_i$ $(i = 0,1,\ldots,p)$ be $(p+1)$ independent and identically distributed random variables, each representing the median in a random sample of size $(2n+1)$ from a population with standard double exponential density function $f(x) = \tfrac12 e^{-|x|}$. Consider the differences $Y_i = X_i - X_0$ $(i = 1,2,\ldots,p)$. The random variables $Y_i$ $(i = 1,2,\ldots,p)$ are correlated, and the distribution of the maximum of the $Y_i$ is of interest in problems of selection and ranking for the double exponential distribution, as explained earlier when discussing $R_5$. In this section, we give a closed form of the distribution of $Y = \max_{1\le i\le p} Y_i$ for some special cases. We have also computed tables of the upper percentage points of $Y = \max_{1\le i\le p} Y_i$ corresponding to the probability levels $\alpha = P^* = 0.75, 0.90, 0.95, 0.99$ for $p = 1(1)9$, $n = 1(1)10$.

For the special case $p = 1$ $(k = 2)$, $n = 1$ (sample size = 3), straightforward integration gives the cdf of $Y$ (see formulae (2.2.2), (2.2.3)) as

$$P(Y \le y) = \int_{-\infty}^{\infty} G(x + y)\, g(x)\, dx$$


Again, for $p = 1$ $(k = 2)$, $n = 2$ (sample size = 5),


P(v<y).,.75ye-3y.2||ye-A,.4|rye-5,^e-3y 


5225  -4y  203  -5y 

■“55Te  "25Te 


All computations related to $R_5$ and given at the end of this chapter were made on a CDC 6500 using Gauss-Laguerre quadrature based on fifteen nodes to perform the numerical integration. Checks on the accuracy of the program for $p = 1$, $n = 1$ showed that these values seem to be correct to three decimal places.


Upper $100(1-P^*)$ percentage points of $Y = \max (X_i - X_0)$ where

3.7762  2.7371  2.2101  1.8855  1.6729  1.5207  1.4088  1.3232  1.2537

Table VI (continued)
CHAPTER III

SOME CLASSIFICATION RULES FOR k UNIVARIATE NORMAL POPULATIONS

3.1 Introduction

In problems of classification, one usually assumes that an
individual belongs to one of the k populations. Based on the observations
from these populations one wishes to assign it to the correct
population.  Such  problems  of  classification  often  arise  in  several 
branches  of  science.  About  forty  years  ago  Fisher  was  consulted  by 
Barnard  [ 6 ] as  to  the  best  method  of  classifying  skeletal  remains 
unearthed  by  archaeological  excavations.  Fisher  [29]  suggested  the  use 
of  the  now  well-known  discriminant  function.  A general  mathematical 
theory  of  statistical  taxonomy  was  built  by  Welch  [79 ] on  foundations 
laid  by  Neyman  and  Pearson's  theory  of  tests  of  hypotheses.  The 
technique  of  discriminant  functions  which  was  devised  by  Fisher  [29] 
has  proved  to  be  invaluable  in  tackling  classification  problems,  but 
the  construction  of  the  discriminant  function  is  possible  only  when 
we  know  the  values  of  the  parameters  characterizing  the  populations 
to  be  discriminated  between.  This  raises  the  question  as  to  what  is 
to  be  done  when  such  knowledge  is  absent.  In  this  chapter  we  describe 
some  classification  procedures  suited  to  such  situations. 

The problem of classifying an individual into one of two categories, or discriminant function analysis as some prefer to call it in the parametric case, has been considered by many authors in the statistical literature. For an extensive bibliography, the reader is referred to Anderson et al. [1]. The probability of misclassification inherent in such classification procedures is not necessarily known to the experimenter. Several authors have treated the problems dealing with the misclassification (e.g., see Smith [74], John [51], [52], Okamoto [59], Sedransk [73], Hills [48], and Sorum [76]). It should be stressed that the above papers deal with the case of two populations only.

In this chapter, we use the subset selection approach to the problem of classification, where the probability of correct classification ($P(CC)$) is guaranteed to be at least a preassigned number $P^*$ $(\tfrac1k < P^* < 1)$ regardless of what the unknown state of nature might be. The classification rules proposed here are different from those considered by Gupta and Govindarajulu [37].

Let $\pi_i$ denote a normal population with an unknown mean $\theta_i$ and variance $\sigma_i^2$ $(i = 0,1,\ldots,k)$. From population $\pi_i$ one observes a random sample $X_{ij}$, $j = 1,\ldots,n_i$, $i = 0,1,\ldots,k$. Based on the above data we allocate $\pi_0$ to one of the $k$ populations with respect to the mean, variance, and the reciprocal of the coefficient of variation. In each case it is assumed that the parameter, for example, the mean $\theta_0$ of $\pi_0$, is equal to one of the $\theta_i$, $i = 1,\ldots,k$.


3.2  Classification  Rules  with  Respect  to  the  Mean 

Case (i). Common known variance $\sigma^2$.

Without loss of generality $\sigma^2$ will be taken to be unity. Let

$$\bar X_i = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}, \quad i = 0,1,\ldots,k. \tag{3.2.1}$$

Then we propose the following classification rule.

$R_9$: Classify $\pi_0$ as $\pi_i$ if and only if

$$|\bar X_i - \bar X_0| \le c_9\sqrt{\frac{1}{n_i} + \frac{1}{n_0}} \tag{3.2.2}$$


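where $c_9$ is chosen such that $P(CC \mid R_9)$ is at least $P^*$ $(\tfrac1k < P^* < 1)$, which is a specified number. Then $P(CC \mid R_9)$ is given by

$$\begin{aligned} P(CC \mid R_9) &= \sum_{i=1}^{k} P\Big(|\bar X_i - \bar X_0| \le c_9\sqrt{\tfrac{1}{n_i} + \tfrac{1}{n_0}} \,\Big|\, \theta_i = \theta_0\Big)\, P(\theta_i = \theta_0) \\ &= \sum_{i=1}^{k} [2\Phi(c_9) - 1]\, q_i \\ &= 2\Phi(c_9) - 1 \end{aligned} \tag{3.2.3}$$

where $\Phi(\cdot)$ denotes the distribution function of the standard normal variate and $q_i$ is the a priori probability of $\theta_i = \theta_0$, $i = 1,\ldots,k$. Since $P(CC \mid R_9) \ge P^*$, we obtain $c_9$ as the smallest non-negative number which satisfies the equation

$$2\Phi(c_9) - 1 = P^*$$

or equivalently

$$c_9 = \Phi^{-1}\Big(\frac{1 + P^*}{2}\Big). \tag{3.2.4}$$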
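Because (3.2.4) is a closed form, rule $R_9$ is easy to apply once the sample means are available. The following is a minimal sketch using Python's standard library; the function names `c9` and `classify_r9` and the argument layout are hypothetical, and the common variance is taken to be one as in the text.

```python
import math
from statistics import NormalDist

def c9(p_star):
    """Constant from (3.2.4): the (1 + P*)/2 quantile of the standard normal."""
    return NormalDist().inv_cdf((1.0 + p_star) / 2.0)

def classify_r9(xbar0, n0, xbars, ns, p_star):
    """Rule R_9: 1-based indices i with |xbar_i - xbar_0| <= c9 * sqrt(1/n_i + 1/n_0)."""
    c = c9(p_star)
    return [i for i, (xb, ni) in enumerate(zip(xbars, ns), start=1)
            if abs(xb - xbar0) <= c * math.sqrt(1.0 / ni + 1.0 / n0)]
```

The returned list may be empty or may contain several indices, which is the behaviour of $R_9$ that the subset selection rule of Section 3.4 is designed to avoid.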


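Theorem 3.2.1. Let $A_i$ be the event "classify $\pi_0$ as $\pi_i$ when $\theta_i \ne \theta_0$". Then $P(A_i \mid R_9) \to 0$ as each of $n_0, n_i \to \infty$.

Proof.

$$\begin{aligned} P(A_i \mid R_9) &= P\Big(|\bar X_i - \bar X_0| \le c_9\sqrt{\tfrac{1}{n_i} + \tfrac{1}{n_0}} \,\Big|\, \theta_i \ne \theta_0\Big) \\ &= P\Bigg(-c_9 + \frac{\theta_0 - \theta_i}{\sqrt{\tfrac{1}{n_i} + \tfrac{1}{n_0}}} \le \frac{\bar X_i - \bar X_0 - (\theta_i - \theta_0)}{\sqrt{\tfrac{1}{n_i} + \tfrac{1}{n_0}}} \le c_9 + \frac{\theta_0 - \theta_i}{\sqrt{\tfrac{1}{n_i} + \tfrac{1}{n_0}}} \,\Bigg|\, \theta_i \ne \theta_0\Bigg) \\ &\to 0 \quad\text{as } n_0, n_i \to \infty. \end{aligned} \tag{3.2.5}$$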


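Case (ii). Common unknown variance $\sigma^2$.

Let

$$\bar X_i = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}, \qquad s^2 = \frac{1}{\nu}\sum_{i=0}^{k}\sum_{j=1}^{n_i} (X_{ij} - \bar X_i)^2,$$

where $\nu = \sum_{i=0}^{k} (n_i - 1)$. In this case, we propose the classification rule as follows:

$R_{10}$: Classify $\pi_0$ as $\pi_i$ if and only if

$$|\bar X_i - \bar X_0| \le c_{10}\, s\sqrt{\frac{1}{n_i} + \frac{1}{n_0}} \tag{3.2.7}$$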


where, as before, $c_{10}$ is determined as the smallest non-negative number which satisfies $P(CC \mid R_{10}) \ge P^*$. Here, we have

$$\begin{aligned} P(CC \mid R_{10}) &= \sum_{i=1}^{k} P\Big(|\bar X_i - \bar X_0| \le c_{10}\, s\sqrt{\tfrac{1}{n_i} + \tfrac{1}{n_0}} \,\Big|\, \theta_i = \theta_0\Big)\, P(\theta_i = \theta_0) \\ &= \sum_{i=1}^{k} P(|T_i| \le c_{10})\, q_i \end{aligned} \tag{3.2.8}$$

where $T_i$, $i = 1,\ldots,k$, are identically distributed (not independent) as Student's $t$ with $\nu$ degrees of freedom. Hence

$$P(CC \mid R_{10}) = P(|T| \le c_{10})$$

where $T$ is distributed as Student's $t$ with $\nu$ degrees of freedom. It should be pointed out that the joint distribution of $T_i$, $i = 1,\ldots,k$, is a multivariate $t$ as studied, for example, in Gupta [39]. The covariance matrix is $\Sigma = (\sigma_{ij})$ where $\sigma_{ij} = \sigma^2$ for $i = j$ and

$$\sigma_{ij} = \frac{\sigma^2}{\sqrt{\big(1 + \tfrac{n_0}{n_i}\big)\big(1 + \tfrac{n_0}{n_j}\big)}} \quad\text{for } i \ne j.$$

Thus

$$P(CC \mid R_{10}) = F_t(c_{10}) - F_t(-c_{10}) = 2F_t(c_{10}) - 1 \tag{3.2.9}$$

where $F_t(\cdot)$ is the cumulative distribution function of $T$, and $c_{10}$ is determined to be the smallest non-negative number which satisfies the equation

$$c_{10} = F_t^{-1}\Big(\frac{1 + P^*}{2}\Big). \tag{3.2.10}$$


If $A_i$ is the same event as defined in Theorem 3.2.1, then we have a similar result for the rule $R_{10}$.

Theorem 3.2.2. $P(A_i \mid R_{10}) \to 0$ as each of $n_0, n_1, \ldots, n_k \to \infty$.

Proof. The proof is straightforward and hence omitted.


3.3  Classification  Rule  with  Respect  to  the  Variance 

It is assumed that the population $\pi_i$ has unknown mean $\theta_i$ and variance $\sigma_i^2$, $i = 1,\ldots,k$. As before we assume that one of the variances $\sigma_i^2$, $i = 1,2,\ldots,k$, is equal to $\sigma_0^2$. We then wish to allocate $\pi_0$ to one of the $k$ populations $\pi_1,\ldots,\pi_k$ with respect to the variance. Assume we take $n_i$ observations from $\pi_i$, $i = 0,1,\ldots,k$. We propose the classification rule:

$R_{11}$: Classify $\pi_0$ as $\pi_i$ if and only if

$$\left|\frac{n_0(n_i - 3)}{n_i(n_0 - 1)}\,\frac{s_0^2}{s_i^2} - 1\right| \le c_{11}$$

where $s_i^2 = \frac{1}{n_i}\sum_{j=1}^{n_i} (X_{ij} - \bar X_i)^2$, $i = 0,1,\ldots,k$, and $c_{11}$ is the smallest non-negative number which satisfies the inequality $P(CC \mid R_{11}) \ge P^*$. We have
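$$\begin{aligned} P(CC \mid R_{11}) &= \sum_{i=1}^{k} P\left\{\left|\frac{n_0(n_i - 3)}{n_i(n_0 - 1)}\,\frac{s_0^2}{s_i^2} - 1\right| \le c_{11} \,\Big|\, \sigma_i = \sigma_0\right\} P\{\sigma_i = \sigma_0\} \\ &= \sum_{i=1}^{k} P\left\{\frac{n_i - 1}{n_i - 3}(1 - c_{11}) \le \frac{n_0 s_0^2/[\sigma_0^2(n_0 - 1)]}{n_i s_i^2/[\sigma_i^2(n_i - 1)]} \le \frac{n_i - 1}{n_i - 3}(1 + c_{11}) \,\Big|\, \sigma_i = \sigma_0\right\} P\{\sigma_i = \sigma_0\} \\ &= \sum_{i=1}^{k} q_i \left\{F_{n_0-1,\,n_i-1}\Big(\frac{n_i - 1}{n_i - 3}(1 + c_{11})\Big) - F_{n_0-1,\,n_i-1}\Big(\frac{n_i - 1}{n_i - 3}(1 - c_{11})\Big)\right\} \end{aligned} \tag{3.3.2}$$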


where $F_{\nu_1,\nu_2}$ is the cdf of an $F$ random variable with $\nu_1$ and $\nu_2$ degrees of freedom, and $q_i$ is the a priori probability of $\sigma_i = \sigma_0$. In the special case when $n_1 = n_2 = \cdots = n_k = n$, (3.3.2) becomes

$$P(CC \mid R_{11}) = F_{n_0-1,\,n-1}\Big(\frac{n-1}{n-3}(1 + c_{11})\Big) - F_{n_0-1,\,n-1}\Big(\frac{n-1}{n-3}(1 - c_{11})\Big) \tag{3.3.3}$$

and $c_{11}$ is the smallest non-negative constant determined from the equation

$$F_{n_0-1,\,n-1}\Big(\frac{n-1}{n-3}(1 + c_{11})\Big) - F_{n_0-1,\,n-1}\Big(\frac{n-1}{n-3}(1 - c_{11})\Big) = P^*.$$


In using the rules $R_9$, $R_{10}$, $R_{11}$, we might classify $\pi_0$ as none, one, two, ..., or $k$ of the $k$ populations. In the following sections, we use the subset (non-empty) selection approach to propose rules that will classify $\pi_0$ as at least one of the $k$ populations. This overcomes the objection that $\pi_0$ may not be classified as any one of the $k$ populations.
3.4 A Subset Selection Approach to Classification Rule with Respect to the Mean

Without loss of generality, we may assume that $\sigma^2 = 1$. Again we assume that there is exactly one population with $\theta_i = \theta_0$. In this case, we propose the following classification rule.

$R_{12}$: Classify $\pi_0$ as $\pi_i$ if and only if

$$|\bar X_i - \bar X_0| \le c_{12} \min_{1\le j\le k} |\bar X_j - \bar X_0| \tag{3.4.1}$$

where $c_{12}$ $(\ge 1)$ is the smallest number which satisfies the inequality $P(CC \mid R_{12}) \ge P^*$.

The classification rule $R_{12}$ has the following desirable asymptotic property: the probability of misclassification approaches zero as the sample sizes $n_0, n_1, \ldots, n_k$ become large. Before we prove this we need the following lemma.

Lemma 3.4.1.

$$P(MC \mid R_{12}) \le \sum_{i=1}^{k} \sum_{\substack{j=1 \\ j\ne i}}^{k} P\big(|\bar X_i - \bar X_0| > c_{12}\,|\bar X_j - \bar X_0| \,\big|\, \theta_i = \theta_0\big)$$

where $MC$ denotes misclassification.

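Proof. We have

$$\begin{aligned} P(MC \mid R_{12}) &= \sum_{i=1}^{k} P\{\pi_0 \text{ is not classified as } \pi_i \mid \theta_i = \theta_0\}\, P\{\theta_i = \theta_0\} \\ &= \sum_{i=1}^{k} P\Big\{|\bar X_i - \bar X_0| > c_{12} \min_{1\le j\le k} |\bar X_j - \bar X_0| \,\Big|\, \theta_i = \theta_0\Big\}\, P\{\theta_i = \theta_0\} \\ &\le \sum_{i=1}^{k} P\big\{|\bar X_i - \bar X_0| > c_{12}\,|\bar X_j - \bar X_0| \text{ for some } j \,\big|\, \theta_i = \theta_0\big\} \\ &\le \sum_{i=1}^{k} \sum_{\substack{j=1 \\ j\ne i}}^{k} P\big(|\bar X_i - \bar X_0| > c_{12}\,|\bar X_j - \bar X_0| \,\big|\, \theta_i = \theta_0\big). \end{aligned} \tag{3.4.2}$$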
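where $Z_i$, $i = 0,1,\ldots,k$, are iid standard normal variates.

$$\begin{aligned} = \sum_{i=1}^{k} \sum_{\substack{j=1 \\ j\ne i}}^{k} \Bigg\{ &\int_{-\infty}^{\infty} \int_{\sqrt{n_i/n_0}\,x}^{\infty} \Big[ \Phi\Big(\sqrt{\tfrac{n_j}{n_0}}\,x + \tfrac{1}{c_{12}}\Big(\sqrt{\tfrac{n_j}{n_i}}\,y - \sqrt{\tfrac{n_j}{n_0}}\,x\Big) + \sqrt{n_j}\,(\theta_0 - \theta_j)\Big) \\ &\qquad - \Phi\Big(\sqrt{\tfrac{n_j}{n_0}}\,x - \tfrac{1}{c_{12}}\Big(\sqrt{\tfrac{n_j}{n_i}}\,y - \sqrt{\tfrac{n_j}{n_0}}\,x\Big) + \sqrt{n_j}\,(\theta_0 - \theta_j)\Big) \Big]\, d\Phi(y)\, d\Phi(x) \\ &+ \int_{-\infty}^{\infty} \int_{-\infty}^{\sqrt{n_i/n_0}\,x} \Big[ \Phi\Big(\sqrt{\tfrac{n_j}{n_0}}\,x + \tfrac{1}{c_{12}}\Big(\sqrt{\tfrac{n_j}{n_0}}\,x - \sqrt{\tfrac{n_j}{n_i}}\,y\Big) + \sqrt{n_j}\,(\theta_0 - \theta_j)\Big) \\ &\qquad - \Phi\Big(\sqrt{\tfrac{n_j}{n_0}}\,x - \tfrac{1}{c_{12}}\Big(\sqrt{\tfrac{n_j}{n_0}}\,x - \sqrt{\tfrac{n_j}{n_i}}\,y\Big) + \sqrt{n_j}\,(\theta_0 - \theta_j)\Big) \Big]\, d\Phi(y)\, d\Phi(x) \Bigg\} \\ = \sum_{i=1}^{k} \sum_{\substack{j=1 \\ j\ne i}}^{k} &\{I_1(i,j) + I_2(i,j)\} \end{aligned}$$

where $I_1(i,j)$, $I_2(i,j)$ are the first and second double integrals inside the double summation signs.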


Now, for every $\varepsilon > 0$, there exists $\delta(\varepsilon)$ such that

$$\int_{|x| > \delta(\varepsilon)} d\Phi(x) < \varepsilon. \tag{3.4.7}$$


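Hence, we get

$$\begin{aligned} I_1(i,j) &\le \int_{-\delta(\varepsilon)}^{\delta(\varepsilon)} \int_{-\delta(\varepsilon)}^{\delta(\varepsilon)} \Big[ \Phi\Big(\big(1 - \tfrac{1}{c_{12}}\big)\sqrt{\tfrac{n_j}{n_0}}\,x + \tfrac{1}{c_{12}}\sqrt{\tfrac{n_j}{n_i}}\,y + \sqrt{n_j}\,(\theta_0 - \theta_j)\Big) \\ &\qquad\qquad - \Phi\Big(\big(1 + \tfrac{1}{c_{12}}\big)\sqrt{\tfrac{n_j}{n_0}}\,x - \tfrac{1}{c_{12}}\sqrt{\tfrac{n_j}{n_i}}\,y + \sqrt{n_j}\,(\theta_0 - \theta_j)\Big) \Big]\, d\Phi(y)\, d\Phi(x) + \varepsilon^2 \\ &\le \int_{-\delta(\varepsilon)}^{\delta(\varepsilon)} \Big[ \Phi\Big(\big(1 - \tfrac{1}{c_{12}}\big)\sqrt{\tfrac{n_j}{n_0}}\,x + \tfrac{1}{c_{12}}\sqrt{\tfrac{n_j}{n_i}}\,\delta(\varepsilon) + \sqrt{n_j}\,(\theta_0 - \theta_j)\Big) \\ &\qquad\qquad - \Phi\Big(\big(1 + \tfrac{1}{c_{12}}\big)\sqrt{\tfrac{n_j}{n_0}}\,x - \tfrac{1}{c_{12}}\sqrt{\tfrac{n_j}{n_i}}\,\delta(\varepsilon) + \sqrt{n_j}\,(\theta_0 - \theta_j)\Big) \Big]\, d\Phi(x) + \varepsilon^2 \\ &\le \Phi\Big(\big(1 - \tfrac{1}{c_{12}}\big)\sqrt{\tfrac{n_j}{n_0}}\,\delta(\varepsilon) + \tfrac{1}{c_{12}}\sqrt{\tfrac{n_j}{n_i}}\,\delta(\varepsilon) + \sqrt{n_j}\,(\theta_0 - \theta_j)\Big) \\ &\qquad - \Phi\Big(-\big(1 + \tfrac{1}{c_{12}}\big)\sqrt{\tfrac{n_j}{n_0}}\,\delta(\varepsilon) - \tfrac{1}{c_{12}}\sqrt{\tfrac{n_j}{n_i}}\,\delta(\varepsilon) + \sqrt{n_j}\,(\theta_0 - \theta_j)\Big) + \varepsilon^2. \end{aligned} \tag{3.4.7}$$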


For  every  n > 0,  there  exists  an  N(e,n)  such  that  for  N > N(e,n), 

4>((i-  --)  /r*I- 6(e)  + -r—  ^-<S(e)  + VnX.  (6  -6.)) 

C1 2 c12  VX|  J ° J 

- $(-(!+  -—-)  M-  6(e)  - jM*  6(e)  + (6  -6.))  < n 

c12  VXo  c12  VXi  J o j - 

(3.4.8) 

Now since $\varepsilon$ and $\eta$ are arbitrary, $I_1(i,j) \to 0$ as $N \to \infty$. Using an
identical argument one can show that $I_2(i,j) \to 0$ as $N \to \infty$. Thus,
$P(MC \mid R_{12}) \to 0$ as $N \to \infty$. This completes the proof of the theorem.

Suppose $n_0 = n_1 = \cdots = n_k = n$; we have the following.

Theorem 3.4.2. If $c_{12}$ is chosen to be the smallest constant such that

$$\int_0^\infty \Phi\Bigl(\frac{\sqrt{2}}{c_{12}}\,x\Bigr)\, d\Phi(x) \le \frac{1}{4} + \frac{1 - P^*}{4k(k-1)}, \tag{3.4.10}$$

then

$$P(CC \mid R_{12}) \ge P^*. \tag{3.4.11}$$


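Proof. Since

$$\begin{aligned}
P(CC \mid R_{12}) &= 1 - P(MC \mid R_{12}) \\
&\ge 1 - \sum_{i=1}^{k} \sum_{\substack{j=1 \\ j \ne i}}^{k} \Bigl\{ P\bigl[\bar{X}_0 - \tfrac{1}{c_{12}}(\bar{X}_i - \bar{X}_0) < \bar{X}_j < \bar{X}_0 + \tfrac{1}{c_{12}}(\bar{X}_i - \bar{X}_0),\ \bar{X}_i > \bar{X}_0 \mid \theta_i = \theta_0\bigr] \\
&\qquad\quad + P\bigl[\bar{X}_0 - \tfrac{1}{c_{12}}(\bar{X}_0 - \bar{X}_i) < \bar{X}_j < \bar{X}_0 + \tfrac{1}{c_{12}}(\bar{X}_0 - \bar{X}_i),\ \bar{X}_i \le \bar{X}_0 \mid \theta_i = \theta_0\bigr] \Bigr\} \\
&= 1 - \sum_{i=1}^{k} \sum_{\substack{j=1 \\ j \ne i}}^{k} \Bigl\{ P\bigl[Z_0 - \tfrac{1}{c_{12}}(Z_i - Z_0) + \sqrt{n}(\theta_0 - \theta_j) < Z_j < Z_0 + \tfrac{1}{c_{12}}(Z_i - Z_0) + \sqrt{n}(\theta_0 - \theta_j),\ Z_i > Z_0\bigr] \\
&\qquad\quad + P\bigl[Z_0 + \tfrac{1}{c_{12}}(Z_i - Z_0) + \sqrt{n}(\theta_0 - \theta_j) < Z_j < Z_0 - \tfrac{1}{c_{12}}(Z_i - Z_0) + \sqrt{n}(\theta_0 - \theta_j),\ Z_i \le Z_0\bigr] \Bigr\}
\end{aligned} \tag{3.4.12}$$

where $Z_i$, $i = 0, 1, \ldots, k$, are iid standard normal variables. But

$$\begin{aligned}
&P\bigl[Z_0 + \sqrt{n}(\theta_0 - \theta_j) - \tfrac{1}{c_{12}}(Z_i - Z_0) < Z_j < Z_0 + \sqrt{n}(\theta_0 - \theta_j) + \tfrac{1}{c_{12}}(Z_i - Z_0),\ Z_i - Z_0 > 0\bigr] \\
&\quad \le P\Bigl[-\frac{\sqrt{2}}{c_{12}} \cdot \frac{Z_i - Z_0}{\sqrt{2}} < Z_j < \frac{\sqrt{2}}{c_{12}} \cdot \frac{Z_i - Z_0}{\sqrt{2}},\ Z_i - Z_0 > 0\Bigr] \\
&\quad = \int_0^\infty \Bigl[\Phi\Bigl(\frac{\sqrt{2}}{c_{12}}\,x\Bigr) - \Phi\Bigl(-\frac{\sqrt{2}}{c_{12}}\,x\Bigr)\Bigr]\, d\Phi(x) \\
&\quad = \int_0^\infty \Bigl[2\Phi\Bigl(\frac{\sqrt{2}}{c_{12}}\,x\Bigr) - 1\Bigr]\, d\Phi(x).
\end{aligned} \tag{3.4.13}$$

Similarly,

$$\begin{aligned}
&P\bigl[Z_0 + \tfrac{1}{c_{12}}(Z_i - Z_0) + \sqrt{n}(\theta_0 - \theta_j) < Z_j < Z_0 - \tfrac{1}{c_{12}}(Z_i - Z_0) + \sqrt{n}(\theta_0 - \theta_j),\ Z_i - Z_0 \le 0\bigr] \\
&\quad \le \int_{-\infty}^{0} \Bigl[2\Phi\Bigl(-\frac{\sqrt{2}}{c_{12}}\,x\Bigr) - 1\Bigr]\, d\Phi(x) \\
&\quad = \int_0^\infty \Bigl[2\Phi\Bigl(\frac{\sqrt{2}}{c_{12}}\,x\Bigr) - 1\Bigr]\, d\Phi(x).
\end{aligned} \tag{3.4.14}$$

So that

$$\begin{aligned}
P(CC \mid R_{12}) &\ge 1 - 2 \sum_{i=1}^{k} \sum_{\substack{j=1 \\ j \ne i}}^{k} \int_0^\infty \Bigl[2\Phi\Bigl(\frac{\sqrt{2}}{c_{12}}\,x\Bigr) - 1\Bigr]\, d\Phi(x) \\
&= 1 - 2k(k-1) \int_0^\infty \Bigl[2\Phi\Bigl(\frac{\sqrt{2}}{c_{12}}\,x\Bigr) - 1\Bigr]\, d\Phi(x).
\end{aligned} \tag{3.4.15}$$

Hence, for any $P^*$, let $c_{12}$ be the smallest non-negative number such that

$$\int_0^\infty \Phi\Bigl(\frac{\sqrt{2}}{c_{12}}\,x\Bigr)\, d\Phi(x) \le \frac{1}{4} + \frac{1 - P^*}{4k(k-1)};$$

then $P(CC \mid R_{12}) \ge P^*$.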
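
As a numerical sanity check on the constant in Theorem 3.4.2, the sketch below reads the defining condition as $\int_0^\infty \Phi(\sqrt{2}x/c_{12})\,d\Phi(x) \le 1/4 + (1-P^*)/(4k(k-1))$ (an assumption about the garbled display) and uses the standard identity $\int_0^\infty \Phi(ax)\,d\Phi(x) = 1/4 + \arctan(a)/(2\pi)$ to obtain a closed form. The function names and the example values of $k$ and $P^*$ are illustrative only and do not come from the report.

```python
import math

def phi_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def lhs_integral(c, upper=10.0, steps=20000):
    """Simpson's rule for int_0^inf Phi(sqrt(2) x / c) dPhi(x), truncated at `upper`."""
    a = math.sqrt(2.0) / c
    h = upper / steps
    total = phi_cdf(0.0) * phi_pdf(0.0) + phi_cdf(a * upper) * phi_pdf(upper)
    for i in range(1, steps):
        x = i * h
        total += (4 if i % 2 else 2) * phi_cdf(a * x) * phi_pdf(x)
    return total * h / 3.0

def c12_closed_form(k, p_star):
    """Smallest c with arctan(sqrt(2)/c) / (2 pi) <= (1 - P*) / (4 k (k - 1))."""
    bound = math.pi * (1.0 - p_star) / (2.0 * k * (k - 1))
    return math.sqrt(2.0) / math.tan(bound)

# Hypothetical example values.
k, p_star = 3, 0.90
c12 = c12_closed_form(k, p_star)
target = 0.25 + (1.0 - p_star) / (4.0 * k * (k - 1))
print(c12, lhs_integral(c12), target)
```

At the computed constant the numerically integrated left-hand side should agree with the right-hand side, which is a quick way to confirm that the reading of the condition is internally consistent.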


3.5 A Subset Selection Approach to Classification Rule with Respect to the Reciprocal of the Coefficient of Variation


$$\bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}, \qquad i = 0, 1, \ldots, k \tag{3.5.1}$$


-2  _ 1 - x2  n|  s|  „ 2 

Si  - W7  l (Xij  V ’ " xn  i 

°i 


(3.5.2) 


“i  = a.  coel 

I 


!lcient  o^  variation  * ^ (3.5.3) 


We assume $\alpha_0 = \theta_0 / \sigma_0$ to be known and further that there exists only
one population with $\alpha_j = \alpha_0$. The classification rule proposed in this
case is
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, Z /rT  a /rT.a.  Z.  . 

+ P{  (I - — ) a + /p-  + -i-i-  < -1  < (1+  — )a 

C1 3 ° c13Ui  C1 3UI  UJ  Uj  C1 3 ° 


z,  z. 


< a 


JVc 


C1 3Ui  c13Ui  Uj  ’ U1  '"°  Ui 


a.  = o^}] 


where 


^ (xi  " 9j)  ^*1 

Zi  » Ui  “a!  » k* 

i i 


S,  + S2  (say) 


Now, 


1 


✓n*.a 


k k oo  00  00 

s,  = S S / / / _ [$(0-~)aot+  v c v 

i*»l  j = l 0 0 a0v“*^ja0  c13  13  13 

jVi 


xt  - -1-0  - /STc,.) 


/rT.a  t 

^ !— 2 ✓nTa.)] 

C1 3 ° C13V  C13V  J J 


- $((!+  )a  t - 


d<K(x)  dF.(v)  dFj  (t) 


where $\Phi(\cdot)$ is the cdf of a unit normal variate and $F_j(\cdot)$ represents the
cdf of the square root of a $\chi^2$ random variable with $(n_j - 1)$ degrees of
freedom.

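For any $\varepsilon > 0$, there exists $\delta(\varepsilon)$ such that

$$\int_{|x| > \delta(\varepsilon)} d\Phi(x) < \varepsilon$$

and

$$\int_{|x| > \delta(\varepsilon)} dF_r(x) < \varepsilon \quad \text{for } r = 1, 2, \ldots, k.$$

Hence, we get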


!**l  j=*l  1 1 1 <6 (e ) I v | <6 (e)  | x | <6 (e) 

j*i 

, _ i/rTa  t __  . /n.a  t 

[*((!“  - — )a  t+  - — - + — — /rT.cx  .)-$>((  1 + *~)a  t~  - — _ 


C1 3 ° CJ3V  C13V 


J J 


C1 3 ° c13v  c>3n 


- v'rTja.)]  d4>(x)  dF.  (v)  dF^ (t)  + e3. 

k k 

EE/  / J 

1*1  j“l  1 1 1 «5 (e)  | v |<6 (e)  | x | <6 (e) 

j^i 

t 

[»((,-  -L)at+  /x7-  > /H) 

c13  13  J J 13 

i Ala  t , 

-<H(1  + -Hat-  - (a . J\.  + )/N)]  d*(x)dF  (x)dF  (t)  + e3. 

i3  ° ci3v  J J Q\y  J 


If  t,  v and  x are  bounded,  then  for  any  e > 0,  there  exists  an  N(e) 
such  that  whenever  N > N(e), 

i » A",  a t 

*(c-  rV  - KA  - tt->« 

C1 3 ° c13  J J 13 

AT  a t 

- *((l+cJ3)aot-  -(a.A>  -T^7),/R)  < e • 

Since $\varepsilon > 0$ is arbitrary, $S_1 \to 0$ as $N \to \infty$. Similarly, one can
show that $S_2 \to 0$ as $N \to \infty$. Then $P(MC \mid R_{13}) \to 0$ as $N \to \infty$.


Theorem 


3.5.2.  Suppose  n^  ■ . . . * n^  = n.  If  c^  is  the  smallest  non- 


negative number  such  that 
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oo  oo  00  /not 

/ / {/  2$(~(-a  + 7 + -—-)t)dt(x) 

00  a v-/na  c13 
o o 

a v-/na  /- 

° ° , x /na 

- I 24> (— — (-a  + — + )t)d«l>(x)  + 24>(a  v-/na  )} 

‘ C.  , O V V o o 

-°o  i 1 3 

1 - P* 

dF  (v)  dF  (t)  < 1 + 


n — 


k (k-1 ) 


Then $P(CC \mid R_{13}) \ge P^*$.

Proof.


$$P(CC \mid R_{13}) = 1 - P(MC \mid R_{13})$$

k k 

> 1 - Z Z [P{- 


‘i 


i=l  j-l‘  C1 3 C13Ui 

j 

Z,  /na 


i 


/na 

n“  + a0~ 

c13Ui 

/na. 


U.  - U.  - c 
J J 


13 


i 


/na 


c13Ui 


+ a - * , 77—  > a - -77-°  ) 

C13U1  ° Uj  Uj  — o U j 


+ P{- 


i 


✓na 


t r — t 1 P—  + a - < rr^-  < — ~ 

C1 3 C1 3Ui  C13Ui  ° Ui  Uj  C1 3 


/na. 
-J 

J 

j 


Z , /na  /na . Z . 

! 2_  +n 1 1 

c13Ui  CI3°1  0 UJ 


/na 


• u7  < % - TT  » 


> 1 - Z Z [P{— ^ - 


/na 


Z. 


a 


/na 


y 

jVi 


+ P{ + 


i=1  J-r  C1 3 c 1 3U I c13Ui  Uj  C1 3 c13Ui 


C1 3Ui  ’ 


Z. 

tt—  >a 
U.  — o 

/na  Z.  a 

2-<  -L<  — ^ 


/na 

c 

U. 

1 

Z. 


/n  a 


C1 3 C13UI  C13U1  Uj  C1 3 c13Ui  c13Ui 


Z. 

1 


/rT  a 


u,  < % - >1 


r 
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a 


k k oo  oo  oo 

1 - E E [/  / / {*(“(■«  + 7 + v 

i = l j-1  00a  v-/na  c13  v 


-)t) 


j»*i 


1 * 

• - t2-'1'1  d«<*>  dF„M  dF„w 


m a v-/na 
a>  00  Q o 

+ /n 

0 0 -» 


l * *'nci 

{4»(J_(a  - * o.)t) 

Cjj  O V V 


, /na 

-4>(- — (-a  + £■  + — ) t)  }d4»(x)dF n(v)dF  (t) 

c^^ovv  n n 


where $F_n(\cdot)$ is the cdf of the square root of a $\chi^2$ random variable with
$(n-1)$ degrees of freedom.


/na_ 


l-k(k-l)  [/  / / {2*(-L.(-a  + £ + --£)t)-1}  d*(x)  dF  (v)  dF  (t) 

0 0a  M-/na  c13  n n 


a v-/na 

oo  OO  o O 

+ J J J 

0 0 -oo 


1 


/na 


{,-2*(-J_(-ao+  i+  d<Kx)  dFn(v)  dFn(t) 


So if $c_{13}$ is chosen to be the smallest nonnegative number such that


v *^na 

...  _ 2$(^-(-a  + 7+  -~)t)  d*(x)-J 

0 0a  v-/na  c13 
o o 


a v-/na 
o o 


*'na 


2$>(— — (-a  + — 

c,3  o v 


+ -~)t)  d*(x) 


+ 24>(a^v-vfiaJ}  dF_(v)  dFjt)  < 1 + 


n ' ' — 


then

$$P(CC \mid R_{13}) \ge P^*.$$
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CHAPTER  IV 

SELECTION  PROCEDURES  FOR  NEGATIVE  BINOMIAL  POPULATIONS 
4.1 Introduction

Let $X_i$, $i = 1, 2, \ldots, k$, be k independent random observations from
populations $\pi_i$, $i = 1, 2, \ldots, k$, respectively, where $\pi_i$ has a negative
binomial distribution with parameters $r_i$, $p_i$. To select a subset of populations
which contains the population associated with the largest $p_i$ when
$r_1 = \cdots = r_k = r$, Bartlett and Govindarajulu [10] proposed a rule
which is based on the statistic $b \min_{1 \le j \le k} X_j - X_i$, where b is a constant.
However, in many situations one may be interested in small values of $p_i$.
In this chapter, our aim is to select a subset of k negative binomial
populations which contains the population with the smallest $p_i$, based on
a statistic of the type $c \max_{1 \le j \le k} X_j - X_i$. It should be pointed out that
although the two problems seem similar, they are not equivalent, i.e.,
the procedures proposed here cannot be obtained from the above paper
[10]. In Section 4.2, we present a result which gives a conservative
constant for the unconditional rule we propose, which is based on an
exact computation of the conditional distribution of the statistic
$c \max_{1 \le j \le k} X_j - X_i$. In Section 4.3, we propose a rule similar to that of
Section 4.2, except that the rule is conditioned on the total number of
observations $T = \sum_{i=1}^{k} X_i$. We obtain a lower bound for the infimum of the

F/G  12/1 


AD-A033  362  PURDUE  UNIV  LAFAYETTE  IND  DEPT  OF  STATISTICS 
SOME  RESULTS  ON  SUBSET  SELECTION  PROBLEMS. (U) 

DEC  76  Y LEONG  N00014-75-C-0455 

UNCLASSIFIED  MIMEOGRAPH  SER-475  NL 



probability of a correct selection. It is shown that when k = 2, the
infimum of this procedure is attained when $p_1 = p_2 = p$, and it is
independent of p. A method leading to a conservative solution for the
constant c(t) depending on T = t is also given. An upper bound for
the expected subset size is derived which holds for all values of the
parameters. The problem of selecting all populations better than a
standard is also considered in Section 4.4.

4.2 An Unconditional Subset Selection Procedure

Let X be a random variable which has the negative binomial
distribution with parameters r, p, i.e. X denotes the number of failures
before the rth success is observed, p being the probability of a success
in an independent trial. Then X is distributed with the probability
mass function

$$P(X = x) = \binom{r+x-1}{x} p^r (1-p)^x, \qquad x = 0, 1, 2, \ldots \tag{4.2.1}$$

It is known that the sum of n independent and identically distributed
negative binomial random variables with parameters r, p is again a
negative binomial random variable with parameters nr, p. We may, therefore,
think of the selection problem as being one of picking a subset
containing the negative binomial population with the smallest p value,
based on a single observation on X from each of the k populations.
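
The probability mass function (4.2.1) and the additivity property quoted above can be checked directly. The following is a minimal sketch assuming an integer r; the helper names `nb_pmf` and `convolve_pmf` are illustrative and not part of the report.

```python
from math import comb

def nb_pmf(x, r, p):
    """P(X = x): x failures before the r-th success, success probability p."""
    return comb(r + x - 1, x) * p**r * (1 - p)**x

def convolve_pmf(x, r, p):
    """P(X1 + X2 = x) for two iid NB(r, p) variables, by direct convolution."""
    return sum(nb_pmf(y, r, p) * nb_pmf(x - y, r, p) for y in range(x + 1))

# The sum of two NB(r, p) variables should match NB(2r, p).
print(convolve_pmf(5, 2, 0.3), nb_pmf(5, 4, 0.3))
```

The agreement of the two printed values illustrates why a single observation per population is enough for the selection problem: n observations can be summed and treated as one negative binomial observation with parameter nr.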

Suppose then that $X_i$, $i = 1, 2, \ldots, k$, are independent observations
from populations $\pi_i$, $i = 1, 2, \ldots, k$, respectively, which have negative
binomial distributions with parameters r, $p_i$. Without loss of generality
and in order to avoid a more complicated notation, we assume that
$p_1 \le p_2 \le \cdots \le p_k$. Any population whose parameter value equals $p_1$ will
be defined as a best population. A correct selection (CS) is defined
as a selection of any subset of the k given populations which contains
at least one best population. Let $P\{CS; k, \underline{p}, R\}$ denote the probability
of a correct selection when the procedure R is used with the given k
and when the true configuration of parameter values is $\underline{p} = (p_1, \ldots, p_k)$.
Let $\Omega$ be the space of all configurations $\underline{p} = (p_1, \ldots, p_k)$ such that
$0 < p_i < 1$, $i = 1, \ldots, k$.

(A) The Rule $R_{14}$ and its Associated Probability of a Correct Selection

We propose the following selection rule:

$R_{14}$: Retain population $\pi_i$ in the selected subset if and only if

$$\frac{r}{r + X_i} \le d_{14} \min_{1 \le j \le k} \frac{r}{r + X_j} \tag{4.2.2}$$

where $d_{14}$ is the smallest number such that the basic probability
requirement

$$\inf_{\underline{p} \in \Omega} P_{\underline{p}}(CS \mid R_{14}) \ge P^* \tag{4.2.3}$$

is satisfied.

Letting $c_{14} = 1/d_{14}$, the rule in (4.2.2) becomes

$R_{14}$: Retain population $\pi_i$ in the selected subset if and only if

$$X_i \ge c_{14} \max_{1 \le j \le k} X_j - (1 - c_{14})\, r \tag{4.2.4}$$

where $0 \le c_{14} \le 1$ is the largest number for which the basic probability
requirement (4.2.3) is satisfied for all parameter points $\underline{p} = (p_1, \ldots, p_k)$
in $\Omega$.

Now for any $\underline{p} \in \Omega$,

$$\begin{aligned}
P_{\underline{p}}(CS \mid R_{14}) &= P_{\underline{p}}\bigl(X_{(1)} \ge c_{14} X_{(j)} - (1 - c_{14})\, r,\ j = 2, \ldots, k\bigr) \\
&= \sum_{x=0}^{\infty} \binom{r+x-1}{x} p_1^r q_1^x \prod_{j=2}^{k} \Biggl\{ \sum_{y=0}^{\left[\frac{x + (1-c_{14})r}{c_{14}}\right]} \binom{r+y-1}{y} p_j^r q_j^y \Biggr\}
\end{aligned} \tag{4.2.5}$$

where $q_i = 1 - p_i$, $i = 1, 2, \ldots, k$, and [a] denotes the greatest integer
less than or equal to a.
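
The retention rule (4.2.4) is simple to apply once a constant is available. Below is a minimal sketch, assuming the rule reads "retain $\pi_i$ iff $X_i \ge c_{14} \max_j X_j - (1 - c_{14}) r$"; the function name `select_r14`, the 0-based indexing, and the sample counts are illustrative assumptions.

```python
def select_r14(xs, r, c14):
    """Return the 0-based indices of populations retained by rule R14.

    xs  : observed failure counts X_1, ..., X_k (one per population)
    r   : common number of successes
    c14 : constant in [0, 1]; smaller values retain more populations
    """
    threshold = c14 * max(xs) - (1 - c14) * r
    return [i for i, x in enumerate(xs) if x >= threshold]

# Hypothetical observations for k = 4 populations with r = 2.
print(select_r14([7, 2, 5, 0], r=2, c14=0.5))
```

Large failure counts point to a small success probability, so the rule keeps the populations whose counts are close to the maximum. With `c14 = 1` only the populations attaining the maximum are retained.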

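Using the well-known identity for the incomplete beta function

$$q^r \sum_{j=0}^{s-1} \binom{r+j-1}{j} p^j = I_q(r, s) = p^s \sum_{j=r}^{\infty} \binom{s+j-1}{j} q^j \tag{4.2.6}$$

where $I_q(r, s) = \dfrac{\Gamma(r+s)}{\Gamma(r)\Gamma(s)} \displaystyle\int_0^q t^{r-1}(1-t)^{s-1}\, dt$, it follows that

$$P_{\underline{p}}(CS \mid R_{14}) = \sum_{x=0}^{\infty} \binom{r+x-1}{x} p_1^r q_1^x \prod_{j=2}^{k} \left\{ \frac{\Gamma\bigl(r + \bigl[\frac{x+(1-c_{14})r}{c_{14}}\bigr] + 1\bigr)}{\Gamma(r)\, \Gamma\bigl(\bigl[\frac{x+(1-c_{14})r}{c_{14}}\bigr] + 1\bigr)} \int_0^{p_j} t^{r-1} (1-t)^{\left[\frac{x+(1-c_{14})r}{c_{14}}\right]}\, dt \right\}.$$

It is immediately evident from the above that $P_{\underline{p}}(CS \mid R_{14})$ is an increasing
function of $p_j$, for $j = 2, \ldots, k$, and consequently $P_{\underline{p}}(CS \mid R_{14})$ is
partially minimized as we let each $p_j$ approach $p_1$ from above. Hence
we have the following result.

Theorem 4.2.1.

$$\begin{aligned}
\inf_{\underline{p} \in \Omega} P_{\underline{p}}(CS \mid R_{14}) &= \inf_{\underline{p} \in \Omega_0} P_{\underline{p}}(CS \mid R_{14}) \\
&= \inf_{0 < p < 1} \sum_{x=0}^{\infty} \binom{r+x-1}{x} p^r q^x \Biggl\{ \sum_{y=0}^{\left[\frac{x+(1-c_{14})r}{c_{14}}\right]} \binom{r+y-1}{y} p^r q^y \Biggr\}^{k-1}
\end{aligned} \tag{4.2.7}$$

where $\Omega_0 = \{\underline{p} = (p, \ldots, p),\ 0 < p < 1\}$.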


It is difficult to determine, analytically or numerically,
where the infimum of the above expression with respect to the common
p (0 < p < 1) takes place. If we could find the infimum then we could
solve for the constant $c_{14}$ to satisfy (4.2.3). In order to overcome
this difficulty, we give a method leading to a conservative solution for
the constant.
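
Although locating the infimum exactly is hard, the right-hand side of (4.2.7) can at least be evaluated on a grid of common p values. The sketch below assumes the expression reads $\sum_x \binom{r+x-1}{x} p^r q^x \{\sum_{y \le [(x+(1-c)r)/c]} \binom{r+y-1}{y} p^r q^y\}^{k-1}$ and truncates the outer sum once the remaining probability mass is below a tolerance. The function names, tolerance, grid, and example constants are illustrative assumptions; a grid minimum is not a proof of the infimum.

```python
from math import comb, floor

def nb_pmf(x, r, p):
    """Negative binomial pmf: x failures before the r-th success."""
    return comb(r + x - 1, x) * p**r * (1 - p)**x

def nb_cdf(m, r, p):
    """P(X <= m) for X ~ NB(r, p)."""
    return sum(nb_pmf(y, r, p) for y in range(m + 1))

def pcs_equal_p(p, r, k, c, tol=1e-10, x_max=10_000):
    """Truncated evaluation of the sum in (4.2.7) at a common value p."""
    total, mass, x = 0.0, 0.0, 0
    while mass < 1.0 - tol and x <= x_max:
        w = nb_pmf(x, r, p)
        m = floor((x + (1 - c) * r) / c)   # upper limit of the inner sum
        total += w * nb_cdf(m, r, p) ** (k - 1)
        mass += w
        x += 1
    return total

# Scan a coarse grid of common p values (hypothetical r, k, c).
grid = [i / 20 for i in range(2, 20)]
worst = min(pcs_equal_p(p, r=2, k=3, c=0.5) for p in grid)
print(worst)
```

For very small p the negative binomial tail is long and the truncated sum becomes slow, which is one practical reason a conservative constant is attractive.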

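For any fixed non-negative integer t, let

$$N(t, c(t), r) = \sum_{x=0}^{\left[\frac{t + (1 - c(t))r}{1 + c(t)}\right] \wedge\, t} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \tag{4.2.8}$$

where [a] is defined as before and $a \wedge b = \min(a, b)$.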

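Theorem 4.2.2. For given $P^*$, let $P_1^* = (P^*)^{1/(k-1)}$, and for $t \ge 0$, let $c_{14}(t)$
be the largest value such that

$$N(t, c_{14}(t), r) \ge \binom{2r+t-1}{t} P_1^*. \tag{4.2.9}$$

If $c_{14} = \inf\{c_{14}(t) : t \ge 0\}$, then

$$\inf_{\underline{p} \in \Omega} P_{\underline{p}}(CS \mid R_{14}) \ge P^*.$$

Proof. For $\underline{p} \in \Omega_0$,

$$\begin{aligned}
P_{\underline{p}}(CS \mid R_{14}) &= P_{\underline{p}}\Bigl(X_{(1)} \ge c_{14} \max_{2 \le j \le k} X_{(j)} - (1 - c_{14})\, r\Bigr) \\
&= \sum_{x=0}^{\infty} \binom{r+x-1}{x} p^r q^x \Biggl\{ \sum_{y=0}^{\left[\frac{x+(1-c_{14})r}{c_{14}}\right]} \binom{r+y-1}{y} p^r q^y \Biggr\}^{k-1} \\
&\ge \Biggl\{ \sum_{x=0}^{\infty} \binom{r+x-1}{x} p^r q^x \sum_{y=0}^{\left[\frac{x+(1-c_{14})r}{c_{14}}\right]} \binom{r+y-1}{y} p^r q^y \Biggr\}^{k-1} \\
&= \Bigl\{ \sum_{t=0}^{\infty} P_{\underline{p}}\bigl(X_1 \ge c_{14} X_2 - (1 - c_{14})\, r \mid X_1 + X_2 = t\bigr) \cdot P_{\underline{p}}(X_1 + X_2 = t) \Bigr\}^{k-1} \\
&\ge \Bigl\{ \sum_{t=0}^{\infty} P_{\underline{p}}\bigl(X_1 \ge c_{14}(t) X_2 - (1 - c_{14}(t))\, r \mid X_1 + X_2 = t\bigr) \cdot P_{\underline{p}}(X_1 + X_2 = t) \Bigr\}^{k-1} \\
&= \Bigl\{ \sum_{t=0}^{\infty} P_{\underline{p}}\Bigl(X_2 \le \Bigl[\tfrac{t + (1 - c_{14}(t))r}{1 + c_{14}(t)}\Bigr] \Bigm| X_1 + X_2 = t\Bigr) \cdot P_{\underline{p}}(X_1 + X_2 = t) \Bigr\}^{k-1} \\
&= \Biggl\{ \sum_{t=0}^{\infty} \frac{N(t, c_{14}(t), r)}{\binom{2r+t-1}{t}}\, P_{\underline{p}}(X_1 + X_2 = t) \Biggr\}^{k-1} \\
&\ge (P_1^*)^{k-1} = P^*.
\end{aligned}$$

The first inequality holds because $u \mapsto u^{k-1}$ is convex for $k \ge 2$ (Jensen's inequality); the second holds because $c_{14} \le c_{14}(t)$ for every t. From Theorem 4.2.1, the proof is completed.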
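
The constants $c_{14}(t)$ of Theorem 4.2.2 can be computed exactly with integer arithmetic. The sketch below assumes that N(t, c, r) is the sum of $\binom{r+x-1}{x}\binom{r+t-x-1}{t-x}$ over $0 \le x \le \min([(t+(1-c)r)/(1+c)], t)$ and uses the equivalence $[(t+(1-c)r)/(1+c)] \ge m \iff c \le (t+r-m)/(r+m)$. The function names are illustrative, and scanning a finite range of t only approximates the infimum over all $t \ge 0$.

```python
from fractions import Fraction
from math import comb

def n_count(t, m, r):
    """Sum over x = 0..min(m, t) of C(r+x-1, x) * C(r+t-x-1, t-x)."""
    return sum(comb(r + x - 1, x) * comb(r + t - x - 1, t - x)
               for x in range(min(m, t) + 1))

def largest_c(t, r, p1_star):
    """Largest c in (0, 1] with N(t, c, r) >= C(2r+t-1, t) * p1_star."""
    total = comb(2 * r + t - 1, t)
    m = 0
    # Find the smallest upper summation limit m that reaches the target.
    while n_count(t, m, r) < p1_star * total:
        m += 1
    # floor((t + (1-c) r) / (1+c)) >= m  <=>  c <= (t + r - m) / (r + m)
    return min(Fraction(1), Fraction(t + r - m, r + m))

def conservative_c(r, p1_star, t_max):
    """Minimum of c(t) over t = 0..t_max (a finite scan, not the true infimum)."""
    return min(largest_c(t, r, p1_star) for t in range(t_max + 1))

# Hypothetical example: geometric case r = 1, k = 2, P* = 0.9 so P1* = 0.9.
print(largest_c(4, 1, 0.9), conservative_c(1, 0.9, 8))
```

Because N(t, c, r) is a step function of c, the largest admissible c is always a ratio of integers, which is why `Fraction` is used here instead of floating point.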

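(B) Some Properties of the Procedure $R_{14}$

We now discuss some properties of the rule $R_{14}$. For $\underline{p} \in \Omega$, let

$$p_{\underline{p}}(i) = P_{\underline{p}}(R_{14} \text{ selects } \pi_{(i)}).$$

Theorem 4.2.3. $p_{\underline{p}}(i)$ is

  non-increasing ($\downarrow$) in $p_i$ when all other components of $\underline{p}$ are fixed, and
  non-decreasing ($\uparrow$) in $p_j$ ($j \ne i$) when all other components of $\underline{p}$ are fixed.

Proof.

$$\begin{aligned}
p_{\underline{p}}(i) &= P_{\underline{p}}(R_{14} \text{ selects } \pi_{(i)}) \\
&= \sum_{x=0}^{\infty} \binom{r+x-1}{x} p_i^r q_i^x \prod_{\substack{j=1 \\ j \ne i}}^{k} \Biggl\{ \sum_{y=0}^{\left[\frac{x+(1-c_{14})r}{c_{14}}\right]} \binom{r+y-1}{y} p_j^r q_j^y \Biggr\} \\
&= \sum_{x=0}^{\infty} \binom{r+x-1}{x} p_i^r q_i^x \prod_{\substack{j=1 \\ j \ne i}}^{k} \left\{ \frac{\Gamma\bigl(r + \bigl[\frac{x+(1-c_{14})r}{c_{14}}\bigr] + 1\bigr)}{\Gamma(r)\, \Gamma\bigl(\bigl[\frac{x+(1-c_{14})r}{c_{14}}\bigr] + 1\bigr)} \int_0^{p_j} t^{r-1} (1-t)^{\left[\frac{x+(1-c_{14})r}{c_{14}}\right]}\, dt \right\}.
\end{aligned}$$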

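It is obvious that $p_{\underline{p}}(i)$ is increasing in $p_j$ ($j \ne i$) when all other
components of $\underline{p}$ are fixed. On the other hand, $p_{\underline{p}}(i)$ can be written as

$$p_{\underline{p}}(i) = E_{p_i} f(X; p_1, \ldots, \hat{p}_i, \ldots, p_k)$$

where

$$f(x; p_1, \ldots, \hat{p}_i, \ldots, p_k) = \prod_{\substack{j=1 \\ j \ne i}}^{k} \sum_{y=0}^{\left[\frac{x+(1-c_{14})r}{c_{14}}\right]} \binom{r+y-1}{y} p_j^r q_j^y$$

and $\hat{p}_i$ denotes that $p_i$ is deleted. Note that $f(x; p_1, \ldots, \hat{p}_i, \ldots, p_k)$ is a
non-decreasing function in x that tends to 1 as $x \to \infty$, and a negative binomial
random variable with cdf $F_p(x)$ is stochastically increasing in $1 - p$. Then for
$p_i \le p_i'$, summation by parts gives

$$\begin{aligned}
E_{p_i} f(X; p_1, \ldots, \hat{p}_i, \ldots, p_k) &= 1 + \sum_{x=0}^{\infty} \bigl[f(x; p_1, \ldots, \hat{p}_i, \ldots, p_k) - f(x+1; p_1, \ldots, \hat{p}_i, \ldots, p_k)\bigr]\, F_{p_i}(x) \\
&\ge 1 + \sum_{x=0}^{\infty} \bigl[f(x; p_1, \ldots, \hat{p}_i, \ldots, p_k) - f(x+1; p_1, \ldots, \hat{p}_i, \ldots, p_k)\bigr]\, F_{p_i'}(x) \\
&= E_{p_i'} f(X; p_1, \ldots, \hat{p}_i, \ldots, p_k).
\end{aligned}$$

Hence $p_{\underline{p}}(i)$ is non-increasing in $p_i$ when all other components of $\underline{p}$
are fixed.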

Corollary 4.2.1. For every $\underline{p} \in \Omega$ and $1 \le i < j \le k$, $p_{\underline{p}}(i) \ge p_{\underline{p}}(j)$.

Proof. The proof follows easily from the above theorem and hence is
omitted. We also have the following corollary.

Corollary 4.2.2. For every $\underline{p} \in \Omega$ and $1 < j \le k$,

$$P_{\underline{p}}(R_{14} \text{ does not select } \pi_{(1)}) \le P_{\underline{p}}(R_{14} \text{ does not select } \pi_{(j)}).$$

Remark 4.2.1. It also follows from Theorem 4.2.3 that $p_{\underline{p}}(1)$, the
probability of a correct selection, attains its minimum when $p_2, \ldots, p_k$
tend to $p_1$ from above.


4.3 A Conditional Subset Selection Procedure

In this section, we use the same notation as in Section 4.2.
We propose a rule $R_{15}$, similar to the rule $R_{14}$ except that this rule
is based on the total number of observations $T = \sum_{i=1}^{k} X_i$, as follows:

$R_{15}$: Select $\pi_i$ if and only if

$$X_i \ge c_{15}(t) \max_{1 \le j \le k} X_j - (1 - c_{15}(t))\, r, \quad \text{given } \sum_{i=1}^{k} X_i = t,$$

where $t \ge 0$ and $0 \le c_{15}(t) \le 1$ is chosen to satisfy the basic probability
requirement. For this rule $R_{15}$, we obtain an exact result for k = 2
in Theorem 4.3.1. For $k \ge 3$, we have a lower bound for the probability
of a correct selection in Theorem 4.3.2.

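Theorem 4.3.1. For a given $P^*$ ($\tfrac{1}{2} < P^* < 1$), k = 2, and any $t \ge 0$, let
$c_{15}(t)$ be the largest value such that

$$N(t, c_{15}(t), r) \ge P^* \binom{2r+t-1}{t}. \tag{4.3.1}$$

Then,

$$\inf_{\underline{p} \in \Omega} P_{\underline{p}}(CS \mid R_{15}) = \inf_{\underline{p} \in \Omega_0} P_{\underline{p}}(CS \mid R_{15}) \ge P^*.$$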

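Proof. Let $D(t) = \Bigl\langle \dfrac{t\, c_{15}(t) - (1 - c_{15}(t))\, r}{1 + c_{15}(t)} \Bigr\rangle$, where $\langle a \rangle$ denotes the
smallest integer $\ge a$; then for any $\underline{p} \in \Omega$,

$$\begin{aligned}
P_{\underline{p}}(CS \mid R_{15}) &= P_{\underline{p}}\bigl(X_{(1)} \ge c_{15}(t) X_{(2)} - (1 - c_{15}(t))\, r \mid X_{(1)} + X_{(2)} = t\bigr) \\
&= P_{\underline{p}}\Bigl(X_{(1)} \ge \frac{t\, c_{15}(t) - (1 - c_{15}(t))\, r}{1 + c_{15}(t)} \Bigm| X_{(1)} + X_{(2)} = t\Bigr) \\
&= \sum_{x=D(t)}^{t} \frac{P_{\underline{p}}(X_{(1)} = x,\ X_{(2)} = t - x)}{P_{\underline{p}}(X_{(1)} + X_{(2)} = t)} \\
&= \frac{\sum_{x=D(t)}^{t} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} p_1^r q_1^x p_2^r q_2^{t-x}}{\sum_{x=0}^{t} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} p_1^r q_1^x p_2^r q_2^{t-x}} \\
&= \frac{\sum_{x=D(t)}^{t} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \lambda^x}{\sum_{x=0}^{t} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \lambda^x}, \qquad \text{where } \lambda = \frac{q_1}{q_2}, \\
&= \bigl(1 + \phi(\lambda)\bigr)^{-1},
\end{aligned} \tag{4.3.2}$$

where

$$\phi(\lambda) = \frac{\sum_{x=0}^{D(t)-1} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \lambda^x}{\sum_{x=D(t)}^{t} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \lambda^x}.$$

By differentiating $\phi(\lambda)$ with respect to $\lambda$, we get

$$\phi'(\lambda) = \frac{\Bigl\{\sum_{x=0}^{D(t)-1} x \tbinom{r+x-1}{x} \tbinom{r+t-x-1}{t-x} \lambda^{x-1}\Bigr\} \Bigl\{\sum_{x=D(t)}^{t} \tbinom{r+x-1}{x} \tbinom{r+t-x-1}{t-x} \lambda^{x}\Bigr\} - \Bigl\{\sum_{x=D(t)}^{t} x \tbinom{r+x-1}{x} \tbinom{r+t-x-1}{t-x} \lambda^{x-1}\Bigr\} \Bigl\{\sum_{x=0}^{D(t)-1} \tbinom{r+x-1}{x} \tbinom{r+t-x-1}{t-x} \lambda^{x}\Bigr\}}{\Bigl\{\sum_{x=D(t)}^{t} \tbinom{r+x-1}{x} \tbinom{r+t-x-1}{t-x} \lambda^{x}\Bigr\}^2} \le 0.$$

Hence $\phi(\lambda)$ is non-increasing in $\lambda$ and the right hand side of (4.3.2) is
non-decreasing in $\lambda$. Since $\lambda \ge 1$, the infimum of $P_{\underline{p}}(CS \mid R_{15})$ occurs at
$\lambda = 1$, i.e. when $p_1 = p_2$. Thus $\inf_{\underline{p} \in \Omega} P_{\underline{p}}(CS \mid R_{15}) = \inf_{\underline{p} \in \Omega_0} P_{\underline{p}}(CS \mid R_{15})$. Note
that this infimum probability does not depend on the common value of
$p_1 = p_2 = p$.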

For k = 2 and selected values of r and t, tables of the constants
$c_{15}(t)$ satisfying (4.3.1) are given at the end of the chapter.

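Theorem 4.3.2. For $\underline{p} \in \Omega$,

$$P_{\underline{p}}(CS \mid R_{15}) \ge 2 - k + \sum_{j=2}^{k} \Bigl\{ \sum_{\ell=0}^{t} P_{\underline{p}}\bigl(X_{(1)} \ge c_{15}(t) X_{(j)} - (1 - c_{15}(t))\, r \mid X_{(1)} + X_{(j)} = \ell\bigr) \cdot P_{\underline{p}}\Bigl(X_{(1)} + X_{(j)} = \ell \Bigm| \sum_{i=1}^{k} X_i = t\Bigr) \Bigr\}.$$

Proof. The proof is straightforward and hence omitted.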

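Theorem 4.3.3. For given $P^*$, $\ldots < P^* < 1$, let $P_2^* = 1 - \dfrac{1 - P^*}{k - 1}$. For $0 \le \ell \le t$,
let $c_{15}(\ell)$ be the largest value such that

$$N(\ell, c_{15}(\ell), r) \ge P_2^* \binom{2r+\ell-1}{\ell}.$$

If $c_{15}(t) = \min\{c_{15}(\ell) : 0 \le \ell \le t\}$, then

$$\inf_{\underline{p} \in \Omega} P_{\underline{p}}(CS \mid R_{15}) \ge P^*.$$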


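Proof. It follows from Theorem 4.3.2 that for $\underline{p} \in \Omega$,

$$\begin{aligned}
P_{\underline{p}}(CS \mid R_{15}) &\ge 2 - k + \sum_{j=2}^{k} \Bigl\{ \sum_{\ell=0}^{t} P_{\underline{p}}\bigl(X_{(1)} \ge c_{15}(t) X_{(j)} - (1 - c_{15}(t))\, r \mid X_{(1)} + X_{(j)} = \ell\bigr) \\
&\qquad\qquad \cdot P_{\underline{p}}\Bigl(X_{(1)} + X_{(j)} = \ell,\ \sum_{i=1}^{k} X_i = t\Bigr) \Big/ P_{\underline{p}}\Bigl(\sum_{i=1}^{k} X_i = t\Bigr) \Bigr\} \\
&\ge 2 - k + \sum_{j=2}^{k} \Bigl\{ \sum_{\ell=0}^{t} P_{\underline{p}}\bigl(X_{(1)} \ge c_{15}(\ell) X_{(j)} - (1 - c_{15}(\ell))\, r \mid X_{(1)} + X_{(j)} = \ell\bigr) \\
&\qquad\qquad \cdot P_{\underline{p}}\Bigl(X_{(1)} + X_{(j)} = \ell,\ \sum_{i=1}^{k} X_i = t\Bigr) \Big/ P_{\underline{p}}\Bigl(\sum_{i=1}^{k} X_i = t\Bigr) \Bigr\}
\end{aligned}$$

and from Theorem 4.3.1, for any j, $j = 2, \ldots, k$,

$$\begin{aligned}
&\inf_{\underline{p} \in \Omega} P_{\underline{p}}\bigl(X_{(1)} \ge c_{15}(\ell) X_{(j)} - (1 - c_{15}(\ell))\, r \mid X_{(1)} + X_{(j)} = \ell\bigr) \\
&\quad = \inf_{\underline{p} \in \Omega_0} P_{\underline{p}}\bigl(X_1 \ge c_{15}(\ell) X_j - (1 - c_{15}(\ell))\, r \mid X_1 + X_j = \ell\bigr) \\
&\quad = \frac{N(\ell, c_{15}(\ell), r)}{\binom{2r+\ell-1}{\ell}} \ge P_2^*,
\end{aligned}$$

we have

$$\begin{aligned}
P_{\underline{p}}(CS \mid R_{15}) &\ge 2 - k + \sum_{j=2}^{k} \sum_{\ell=0}^{t} P_2^* \cdot P_{\underline{p}}\Bigl(X_{(1)} + X_{(j)} = \ell,\ \sum_{i=1}^{k} X_i = t\Bigr) \Big/ P_{\underline{p}}\Bigl(\sum_{i=1}^{k} X_i = t\Bigr) \\
&= 2 - k + (k - 1) P_2^* \\
&= P^*.
\end{aligned}$$

Thus the proof is completed.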

* 

Hence, for each k, $P^*$, Theorem 4.3.3 guarantees the existence of
$c_{15}(t)$ for the rule $R_{15}$ and gives a method to find $c_{15}(t)$ for given
$\sum_{i=1}^{k} X_i = t$ such that $P_{\underline{p}}(CS \mid R_{15}) \ge P^*$ for any $\underline{p} \in \Omega$.

An Upper Bound on the Expected Subset Size for $R_{15}$

For the procedure $R_{15}$, the subset size S of the selected subset is
a random variable which takes on only integer values from 1 to k,
inclusively. For any fixed values of k and $P^*$, the expected size
of the selected subset is a function of the true configuration
$\underline{p} = (p_1, \ldots, p_k)$.

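Lemma 4.3.1. For k = 2, let

$$a = \Bigl\langle \frac{t\, c_{15}(t) - (1 - c_{15}(t))\, r}{1 + c_{15}(t)} \Bigr\rangle, \qquad b = \Bigl[ \frac{t + (1 - c_{15}(t))\, r}{1 + c_{15}(t)} \Bigr].$$

Then

$$\sup_{\underline{p} \in \Omega} E_{\underline{p}}(S \mid R_{15}) \le 1 + \left\{ 1 + \frac{\sum_{x=b+1}^{t} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x}}{\sum_{x=a}^{b} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x}} \right\}^{-1}.$$

Proof. For $\underline{p} \in \Omega$, let $\lambda = q_1 / q_2$; then $\lambda \ge 1$ and

$$\begin{aligned}
E_{\underline{p}}(S \mid R_{15}) &= P_{\underline{p}}\bigl(X_{(1)} \ge c_{15}(t) X_{(2)} - (1 - c_{15}(t))\, r \mid X_{(1)} + X_{(2)} = t\bigr) \\
&\quad + P_{\underline{p}}\bigl(X_{(2)} \ge c_{15}(t) X_{(1)} - (1 - c_{15}(t))\, r \mid X_{(1)} + X_{(2)} = t\bigr) \\
&= P_{\underline{p}}\Bigl(X_{(1)} \ge \frac{t\, c_{15}(t) - (1 - c_{15}(t))\, r}{1 + c_{15}(t)} \Bigm| X_{(1)} + X_{(2)} = t\Bigr) \\
&\quad + P_{\underline{p}}\Bigl(X_{(1)} \le \frac{t + (1 - c_{15}(t))\, r}{1 + c_{15}(t)} \Bigm| X_{(1)} + X_{(2)} = t\Bigr) \\
&= 1 + P_{\underline{p}}\Bigl(\frac{t\, c_{15}(t) - (1 - c_{15}(t))\, r}{1 + c_{15}(t)} \le X_{(1)} \le \frac{t + (1 - c_{15}(t))\, r}{1 + c_{15}(t)} \Bigm| X_{(1)} + X_{(2)} = t\Bigr) \\
&= 1 + P_{\underline{p}}\bigl(a \le X_{(1)} \le b \mid X_{(1)} + X_{(2)} = t\bigr) \\
&= 1 + \frac{\sum_{x=a}^{b} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} p_1^r q_1^x p_2^r q_2^{t-x}}{\sum_{x=0}^{t} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} p_1^r q_1^x p_2^r q_2^{t-x}} \\
&= 1 + \frac{\sum_{x=a}^{b} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \lambda^x}{\sum_{x=0}^{t} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \lambda^x} \\
&= 1 + \bigl\{ 1 + g(\lambda) + h(\lambda) \bigr\}^{-1}.
\end{aligned}$$

Here

$$g(\lambda) = \frac{\sum_{x=0}^{a-1} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \lambda^x}{\sum_{x=a}^{b} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \lambda^x}
\quad \text{and} \quad
h(\lambda) = \frac{\sum_{x=b+1}^{t} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \lambda^x}{\sum_{x=a}^{b} \binom{r+x-1}{x} \binom{r+t-x-1}{t-x} \lambda^x}.$$

Then, by differentiating $g(\lambda)$ and $h(\lambda)$ with respect to $\lambda$, it is easily
seen that $g(\lambda)$ is a non-increasing function of $\lambda$ and $h(\lambda)$ is a non-decreasing
function of $\lambda$. Hence, from (4.3.2) the Lemma follows.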

£+(l-c.,(£))r 

Theorem  <4.3.^.  for  k ^ 2,  0 £ £ £ t,  let  b(£)  = f ] + ‘c — — 1 
£c.  (£)-(l-c)t.U))r 


sup  E (S|Rlc)<y  max  {l  + 
0<£<t 


e <rTl  ><"£"> 

x-bjtl-H 

e (rT')(rtr'' 

x=a(£)  X * X 


Proof.  For  £ e f}. 


E (S|R  ) = E P (X,.v  > c.c(t)  max  X , (1-c. c(t) ) r | E X.  = t) 
£ 15  . = 1 £ (i)  - 15  .d.  IjJ  15  ; =i  1 


ir,;,  .j.  VxcD  -cis(t)  x(jr(,-ei5(t))ri,j1  xi = 0 


(k-l)P.  ( E X.=t) 
H i = j ' 


E E E P_  (X/j  \ >c.  r (t)X  , - (1-c. (t) ) r , 
i = l j*i  £*0  £■  (')_  ,5  (j)  15 


(k-1)  P ( E X.-t) 
E |-| 


X,.,+X,.,-ll,  E X-  .=t-£) 

(i)  (j)  Sftj  J (*) 

k t 

■,.E,  )ll  MVX(')-Cl5(,,X(J)-<'-cl5(*»r| 
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k"  .Z.  E {PD(Xn)-ciq(t)xor^-c.cft))r| 

(k-l)P  (z  X -t)  ’<J  £m°  £ 5 (j)  15 

£ 1*1  ' 

x(t)  + x(j)  = £) 

* Vx0)  ici5(t)xd )-(|-ci5<t»rl*(„  ♦ x(J)  - m 

•VX0)  * xu>  ■ *'  ,E,  x(i)  * *> 


k .E  E {sup  [pD(xfn  L c.  (t)  x,., 

(k-l )P  ( E X.*t)  l<J  1=0  ^ 5 

£ M ' 

- 0-c)5(t))r|X(})  + X(j)  = l) 
+ P£(X0)  ~C15(t)  X(i)"(,"C15(t)>rlX(i)^X(j)  * £)]} 


P£(X0)  + X(j)  = £*  .f,  Xi  " 


I c 

k ' ^ 2 {1  + 

(k-l)P  ( E X -t)  f<j  £“° 

£ i = i 1 


E (r+x-l  wr+fx-l, 

, x=b  (&)+l x *"x 

i + I').'  


b (Z) , 

r /r+x-l  wr+£-x-K 

x-,«!)  ‘ « )(  »-«  ’ 


• Vxc)+X(j)  ■ *•  .z,  Xi  ' 


by  Lemma  k.3.1 . , 


1 rrr  *■  max  { i + 

i<j  0<£<t 


I — } 

K ( 0 \ 4.1  * £-X 


, , x«b(A)+l  x K~* 

' + ~bjn : 

= (rT,Hrtr,> 

x*a  (i)  x * * 


The  proof  is  thus  completed. 


4.4 Selection of all Populations Better Than a Standard

Case 1. Known Standard.

We consider the problem where k negative binomial populations $\pi_i$
with parameters $r_i$, $p_i$ ($i = 1, \ldots, k$) are to be compared with a standard
$\pi_0$ with parameters $r_0$, $p_0$ where $p_0$ is the known value of the probability
of a success in the standard population. Since $p_i$ is the probability
of a success, we define $\pi_i$ to be better than $\pi_0$ when $p_i \ge p_0$. Let
$X_i$ denote the number of failures before the $r_i$th success is observed
in population $\pi_i$. Then for selecting all populations better than the
standard we define the following procedure:

$R_{D_1}$: Retain in the selected subset those and only those populations
$\pi_i$ ($i = 1, \ldots, k$) for which

$$X_i \le p_0 + D_1. \tag{4.4.1}$$

Let $\ell_1$ and $\ell_2$ denote the number of populations with $p_i \ge p_0$ and $p_i < p_0$,
respectively, so that $\ell_1 + \ell_2 = k$. In general, $\ell_1$ and $\ell_2$ are unknown.
Then the probability $P_1$ of a correct decision is given by

$$\begin{aligned}
P_1 &= {\prod_{i=1}^{\ell_1}}' P\{X_i \le p_0 + D_1\} \\
&= {\prod_{i=1}^{\ell_1}}' \Biggl\{ \sum_{x=0}^{[p_0 + D_1]} \binom{r_i+x-1}{x} p_i^{r_i} q_i^x \Biggr\}
\end{aligned}$$

where primes refer to the $\ell_1$ populations with $p_i \ge p_0$. Let $m = [p_0 + D_1]$.
Then

$$P_1 = {\prod_{i=1}^{\ell_1}}' \frac{\Gamma(r_i + m + 1)}{\Gamma(r_i)\, \Gamma(m + 1)} \int_0^{p_i} t^{r_i - 1} (1 - t)^m\, dt.$$

A lower bound on the above probability is obtained by setting $p_i = p_0$
($i = 1, \ldots, k$) and $\ell_1 = k$. Hence the inequality determining $D_1$ becomes

$$\prod_{i=1}^{k} \Biggl( \sum_{x=0}^{[p_0 + D_1]} \binom{r_i+x-1}{x} p_0^{r_i} q_0^x \Biggr) \ge P^* \tag{4.4.2}$$

and the solution is the smallest value of $D_1$ satisfying (4.4.2). If
$r_i = r$ ($i = 1, \ldots, k$), then (4.4.2) reduces to

$$\sum_{x=0}^{[p_0 + D_1]} \binom{r+x-1}{x} p_0^r q_0^x \ge (P^*)^{1/k}.$$

This is easily solved by consulting a table of cumulative negative
binomial probabilities.
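
In place of a printed table, the cutoff for the equal-r case can be found by accumulating the negative binomial cdf. The sketch below assumes the reduced condition reads $\sum_{x=0}^{m} \binom{r+x-1}{x} p_0^r q_0^x \ge (P^*)^{1/k}$, where m stands for the integer cutoff $[p_0 + D_1]$; it returns the smallest such m rather than $D_1$ itself. The function names and example values are illustrative.

```python
from math import comb

def nb_pmf(x, r, p):
    """Negative binomial pmf: x failures before the r-th success."""
    return comb(r + x - 1, x) * p**r * (1 - p)**x

def smallest_cutoff(r, p0, k, p_star):
    """Smallest integer m with P(X <= m) >= (P*)^(1/k) for X ~ NB(r, p0)."""
    target = p_star ** (1.0 / k)
    m, cdf = 0, nb_pmf(0, r, p0)
    while cdf < target:
        m += 1
        cdf += nb_pmf(m, r, p0)
    return m

# Hypothetical example: geometric standard (r = 1) with p0 = 0.5 and P* = 0.9.
print(smallest_cutoff(1, 0.5, 1, 0.9), smallest_cutoff(1, 0.5, 2, 0.9))
```

The cutoff grows with k because each of the k populations must individually clear the stricter level $(P^*)^{1/k}$.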

Case 2. Unknown Standard.

The assumptions are the same as in Case 1 except that $p_0$ is not
known. Let $X_0$ be the number of failures before the $r_0$th success is
observed in population $\pi_0$. Consider the following procedure:

$R_{D_2}$: Retain in the selected subset those and only those populations
$\pi_i$ ($i = 1, \ldots, k$) for which

$$X_i \le \frac{1}{D_2} X_0 + \Bigl(\frac{1}{D_2} - 1\Bigr) r_i. \tag{4.4.3}$$

The probability $P_2$ of retaining all the populations with $p_i \ge p_0$
attains a minimum where $p_i = p_0$ ($i = 1, \ldots, k$) and $\ell_1 = k$ and is given
by

$$P_2(p_0, D_2) = \sum_{x=0}^{\infty} \prod_{i=1}^{k} \Biggl\{ \sum_{y=0}^{\left[\frac{x}{D_2} + \left(\frac{1}{D_2} - 1\right) r_i\right]} \binom{r_i+y-1}{y} p_0^{r_i} q_0^y \Biggr\} \binom{r_0+x-1}{x} p_0^{r_0} q_0^x.$$

Then the desired value of $D_2$ for the rule defined in (4.4.3) is the
largest number for which

$$\min_{0 < p_0 < 1} P_2(p_0, D_2) \ge P^*.$$

For $r_i = r$ ($i = 0, 1, \ldots, k$),

$$\begin{aligned}
P_2 &\ge \Biggl\{ \sum_{x=0}^{\infty} \binom{r+x-1}{x} p_0^r q_0^x \sum_{y=0}^{\left[\frac{x}{D_2} + \left(\frac{1}{D_2} - 1\right) r\right]} \binom{r+y-1}{y} p_0^r q_0^y \Biggr\}^{k} \\
&= \Bigl\{ P\Bigl(X_1 \le \frac{1}{D_2} X_0 + \Bigl(\frac{1}{D_2} - 1\Bigr) r\Bigr) \Bigr\}^{k}.
\end{aligned}$$

For a given $P^*$, let $P_1^* = (P^*)^{1/k}$ and for any $t \ge 0$, let $D_2(t)$ be the largest
value such that

$$N(t, D_2(t), r) \ge \binom{2r+t-1}{t} P_1^*,$$

where $N(t, D_2(t), r)$ is defined in (4.2.8). Let $D_2 = \inf\{D_2(t) : t \ge 0\}$.
Then a conservative value of $D_2$ is obtained such that $P_2 \ge P^*$. The
arguments for the above statement are essentially the same as those in
Section 4.2.


4.5 Application

If a structure consists of n components it will be called a
structure of order n. The state of all components of such a system
will be described by a vector $\underline{x} = (x_1, \ldots, x_n)$ where $x_i = 1$ means
"ith component performs" and $x_i = 0$ means "ith component fails". In
reliability theory, structures satisfying the following conditions

(i) $\phi(\underline{1}) = 1$ where $\underline{1} = (1, \ldots, 1)$

(ii) $\phi(\underline{0}) = 0$ where $\underline{0} = (0, \ldots, 0)$

(iii) $\phi(\underline{x}) \ge \phi(\underline{y})$ whenever $x_i \ge y_i$, $i = 1, \ldots, n$

are called coherent (see Birnbaum, Esary and Saunders [17]) or monotonic
(see Barlow and Proschan [5]).

Now consider k independent structures each of which consists of
n components in parallel. We put each of the components of the
structure on test. Suppose a component might not perform at the first
trial. Let $Y_{ij}$ denote the number of failures of the jth component of
the structure $\pi_i$ before it performs, $j = 1, 2, \ldots, n$; $i = 1, 2, \ldots, k$.
It is easy to see that $Z_i = \min(Y_{i1}, \ldots, Y_{in})$ is a geometric random
variable with parameter $p_i = 1 - \prod_{j=1}^{n} (1 - p_{ij})$, where $p_{ij}$ is the
probability that the jth component of the ith structure will perform.
Hence, the problem of selecting the structure with the smallest
reliability is the same as that of selecting the geometric population
with the smallest parameter as discussed in this chapter when r = 1.